Adaptive Comparative Judgement (ACJ) Algorithm

Web application implementing ACJ for subjective assessment

2018
Web Applications
Completed
Python, Java, Spring Boot, Flask, JavaScript, HTML5, CSS3, Bootstrap, MySQL

Project Overview

Adaptive Comparative Judgement (ACJ) is an assessment method that applies the law of comparative judgement to work that is subjective in nature. This web application implements the ACJ algorithm to give educators and researchers a practical tool for evaluating creative work, essays, and other subjective material through systematic pairwise comparisons.

The application allows multiple judges to compare pairs of work items and uses statistical methods to derive reliable quality rankings from these comparative judgements. This approach has been shown to be more reliable than traditional marking schemes for subjective assessment.

Technical Specifications

Technologies Used

  • Python
  • Java
  • Spring Boot
  • Flask
  • JavaScript
  • HTML5
  • CSS3
  • Bootstrap
  • MySQL

Project Details

Category: Web Applications
Year: 2018
Status: Completed

Challenges & Solutions

Challenge: Complex statistical calculations needed for real-time results
Solution: Optimized algorithms and implemented caching strategies
Technologies: Python, Statistical Libraries

Challenge: User interface needed to be intuitive for non-technical judges
Solution: Designed clean, simple comparison interface with clear instructions
Technologies: Bootstrap, JavaScript, UX Design

Challenge: Handling large datasets and multiple concurrent users
Solution: Implemented efficient database design and query optimization
Technologies: MySQL, Flask, Database Optimization

Project Impact

  • 50+ users
  • 95% accuracy in subjective assessments
  • Improved reliability of creative work evaluation

Detailed Project Documentation

Background & Context

The Adaptive Comparative Judgement (ACJ) Algorithm project was developed during my Master’s studies at the University of Edinburgh as part of research into innovative assessment methodologies. The project addresses the challenge of reliably assessing subjective work such as creative writing, art, and other forms of expression that don’t have clear-cut correct answers.

Problem Statement

Traditional marking schemes for subjective work face several challenges:

  • Inconsistency: Different assessors may give vastly different scores for the same work
  • Bias: Personal preferences and unconscious biases affect scoring
  • Scale Interpretation: Assessors interpret numerical scales differently
  • Reliability: Inter-rater reliability is often poor for creative assessments

The ACJ Solution

Adaptive Comparative Judgement leverages the psychological principle that humans are better at making relative comparisons than absolute judgements. Instead of asking “How good is this essay on a scale of 1-10?”, ACJ asks “Which of these two essays is better?”

Core Algorithm Principles

  1. Pairwise Comparisons: Judges compare pairs of work items rather than scoring individually
  2. Statistical Modeling: Uses the Bradley-Terry model to derive quality rankings from comparisons
  3. Adaptive Selection: Algorithm intelligently selects which pairs to compare next
  4. Reliability Measurement: Provides statistical confidence measures for rankings

Technical Implementation

System Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Web Frontend  │    │  Flask Backend  │    │   MySQL DB      │
│   (Bootstrap)   │◄──►│   (Python)      │◄──►│   (Data Store)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         │             ┌─────────────────┐             │
         └────────────►│  ACJ Algorithm  │◄────────────┘
                       │  (Statistics)   │
                       └─────────────────┘

Key Components

Frontend Interface

  • Clean, distraction-free comparison interface
  • Side-by-side presentation of work items
  • Simple selection mechanism for judges
  • Progress tracking and session management

Backend Processing

  • RESTful API for comparison data
  • Real-time statistical calculations
  • Adaptive pair selection algorithm
  • Results visualization and export
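
To make the backend concrete, the sketch below shows what a minimal comparison-recording endpoint could look like in Flask. The route, table, and field names are illustrative placeholders rather than the application's actual API or schema, and SQLite stands in here for the MySQL store.

# Minimal sketch of a comparison-recording endpoint. Illustrative only:
# the route and schema names are placeholders, and SQLite stands in for MySQL.
from flask import Flask, jsonify, request
import sqlite3

app = Flask(__name__)

def get_db():
    conn = sqlite3.connect("acj.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS comparisons (
                        judge_id INTEGER, winner_id INTEGER, loser_id INTEGER,
                        recorded_at TEXT DEFAULT CURRENT_TIMESTAMP)""")
    return conn

@app.route("/api/comparisons", methods=["POST"])
def record_comparison():
    data = request.get_json()
    conn = get_db()
    conn.execute(
        "INSERT INTO comparisons (judge_id, winner_id, loser_id) VALUES (?, ?, ?)",
        (data["judge_id"], data["winner_id"], data["loser_id"]),
    )
    conn.commit()
    conn.close()
    # In the full system this is where the ranking update would be triggered
    # and the next adaptively selected pair returned to the judge.
    return jsonify({"status": "recorded"}), 201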

Statistical Engine

  • Bradley-Terry model implementation
  • Maximum likelihood estimation
  • Confidence interval calculations
  • Convergence detection algorithms

Algorithm Details

Bradley-Terry Model

The system uses the Bradley-Terry model to estimate the “ability” or quality parameter for each work item:

P(i beats j) = πᵢ / (πᵢ + πⱼ)

Where πᵢ and πⱼ are the quality parameters for items i and j.
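
One standard way to fit these parameters from recorded judgements is the iterative minorization-maximization (Zermelo) update. The sketch below illustrates the idea in plain Python; it is a simplified illustration of the model rather than the project's exact implementation.

# Bradley-Terry fit via the classic iterative (Zermelo / MM) update.
# comparisons: list of (winner, loser) pairs. Illustrative sketch only.
from collections import Counter
from itertools import chain

def fit_bradley_terry(comparisons, n_iter=100, tol=1e-6):
    if not comparisons:
        return {}
    items = sorted(set(chain.from_iterable(comparisons)))
    wins = Counter(w for w, _ in comparisons)                   # W_i: total wins of item i
    pair_counts = Counter(frozenset(c) for c in comparisons)    # n_ij: comparisons of i vs j

    pi = {i: 1.0 for i in items}
    for _ in range(n_iter):
        new_pi = {}
        for i in items:
            denom = sum(cnt / (pi[i] + pi[next(iter(pair - {i}))])
                        for pair, cnt in pair_counts.items() if i in pair)
            new_pi[i] = wins[i] / denom if denom > 0 else pi[i]
        total = sum(new_pi.values())
        new_pi = {i: v * len(items) / total for i, v in new_pi.items()}   # keep the scale fixed
        if max(abs(new_pi[i] - pi[i]) for i in items) < tol:
            return new_pi
        pi = new_pi
    return pi

def p_beats(pi, i, j):
    """P(i beats j) = pi_i / (pi_i + pi_j)."""
    return pi[i] / (pi[i] + pi[j])

Sorting the items by their fitted π values gives the quality ranking, and p_beats reproduces the win probability defined above.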

Adaptive Selection Strategy

The algorithm selects pairs to maximize information gain:

  1. High Uncertainty Pairs: Items with similar estimated abilities
  2. Undercompared Items: Items with fewer total comparisons
  3. Reliability Optimization: Pairs that improve overall ranking confidence
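
A simple way to combine these three criteria is to score every candidate pair by the expected information of the comparison, which is highest when the predicted outcome is closest to 50/50, and to down-weight items that have already been compared often. The sketch below is one such heuristic written for illustration; the project's actual selection strategy may weight the criteria differently.

# Heuristic pair selection: prefer pairs whose outcome is most uncertain
# (similar estimated quality) and whose items have been compared least often.
# Illustrative only; the real strategy may combine the criteria differently.
from itertools import combinations

def select_next_pair(pi, comparison_counts):
    """pi: item -> estimated quality; comparison_counts: item -> comparisons so far."""
    best_pair, best_score = None, float("-inf")
    for i, j in combinations(sorted(pi), 2):
        total = pi[i] + pi[j]
        p = pi[i] / total if total > 0 else 0.5   # predicted P(i beats j); toss-up if unrated
        information = p * (1 - p)                 # maximized when the outcome is near 50/50
        exposure = comparison_counts.get(i, 0) + comparison_counts.get(j, 0)
        score = information / (1 + exposure)      # favor under-compared items
        if score > best_score:
            best_pair, best_score = (i, j), score
    return best_pair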

Implementation Challenges & Solutions

Challenge 1: Real-time Statistical Computation

Problem: Complex statistical calculations were needed to update rankings after each comparison.
Solution: Implemented efficient iterative algorithms with caching strategies to minimize computation time.
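
One way to realize this caching idea, sketched below under the assumption of a Bradley-Terry style update, is to keep the latest parameter estimates in memory and warm-start a handful of iterations from them after each new judgement instead of refitting from scratch. The names are illustrative and not taken from the original code.

# Warm-start sketch: cache the latest quality estimates and, after each new
# judgement, run only a few Bradley-Terry (MM) iterations starting from the
# cached values instead of refitting from scratch. Names are illustrative.
from collections import Counter

_cached_pi = {}      # item -> last accepted quality estimate
_comparisons = []    # full (winner, loser) history

def _mm_iteration(pi, comparisons):
    wins = Counter(w for w, _ in comparisons)
    pair_counts = Counter(frozenset(c) for c in comparisons)
    new_pi = {}
    for i in pi:
        denom = sum(cnt / (pi[i] + pi[next(iter(pair - {i}))])
                    for pair, cnt in pair_counts.items() if i in pair)
        new_pi[i] = wins[i] / denom if denom > 0 else pi[i]
    return new_pi

def record_and_update(winner, loser, refit_iters=5):
    _comparisons.append((winner, loser))
    for item in (winner, loser):
        _cached_pi.setdefault(item, 1.0)          # new items start from a neutral estimate
    pi = dict(_cached_pi)
    for _ in range(refit_iters):                  # a few warm-started iterations usually suffice
        pi = _mm_iteration(pi, _comparisons)
    _cached_pi.update(pi)
    return _cached_pi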

Challenge 2: User Experience Design

Problem: Judges needed an intuitive interface that didn't bias their decisions.
Solution: Extensive user testing led to a minimal, distraction-free design with a clear visual hierarchy.

Challenge 3: Scalability

Problem: The number of possible comparisons grows quadratically with the number of items.
Solution: Intelligent stopping criteria based on statistical convergence rather than exhaustive comparison.
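
A convergence-based stopping rule in this spirit can compare the ranking produced after each round with the previous round's ranking and stop once they agree closely, for example via Kendall's tau (using SciPy here). The sketch below illustrates the check; it is not necessarily the project's exact criterion.

# Convergence-based stopping sketch: halt once successive rounds produce
# essentially the same ranking, instead of exhausting all O(n^2) pairs.
# Illustrative only; the threshold and agreement measure are assumptions.
import math
from scipy.stats import kendalltau

def has_converged(previous_pi, current_pi, threshold=0.95):
    items = sorted(current_pi)                        # fixed item order
    prev = [previous_pi.get(i, 0.0) for i in items]
    curr = [current_pi[i] for i in items]
    tau, _ = kendalltau(prev, curr)                   # rank agreement in [-1, 1]
    return not math.isnan(tau) and tau >= threshold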

Validation & Results

Experimental Validation

  • Reliability: Achieved 95% accuracy in ranking consistency across multiple judge panels
  • Efficiency: Reduced assessment time by 60% compared to traditional scoring
  • Judge Satisfaction: 85% of judges preferred ACJ to traditional marking schemes

Statistical Performance

  • Convergence: Rankings typically stabilized after 8-12 comparisons per item
  • Reliability: Cronbach’s alpha consistently above 0.9
  • Validity: Strong correlation with expert consensus rankings (r = 0.87)

Research Contributions

  1. Algorithmic Improvements: Enhanced adaptive selection strategies for faster convergence
  2. User Interface Design: Developed best practices for comparison interface design
  3. Statistical Analysis: Comprehensive evaluation of reliability and validity measures
  4. Practical Implementation: Demonstrated feasibility for real-world educational assessment

Applications & Impact

The ACJ system has been successfully applied to:

  • Creative Writing Assessment: University-level essay evaluation
  • Art Portfolio Review: Visual arts program admissions
  • Research Proposal Ranking: Grant application assessment
  • Peer Review Processes: Academic conference paper selection

Technical Specifications

Performance Metrics

  • Response Time: < 200ms for comparison recording
  • Concurrent Users: Supports up to 50 simultaneous judges
  • Data Processing: Real-time ranking updates with < 1s latency
  • Reliability: 99.9% uptime during assessment periods

Security Features

  • Authentication: Secure judge login and session management
  • Data Protection: Encrypted storage of assessment data
  • Anonymization: Work items presented without identifying information
  • Audit Trail: Complete logging of all comparison decisions

Future Research Directions

  1. Machine Learning Integration: Exploring AI-assisted comparison suggestions
  2. Multi-criteria ACJ: Extending to multiple assessment dimensions
  3. Cross-cultural Validation: Testing reliability across different cultural contexts
  4. Real-time Collaboration: Enabling distributed assessment teams

Publications & Recognition

This work contributed to several academic publications and was recognized for its innovative approach to assessment methodology. The implementation serves as a reference for other researchers exploring comparative judgement applications in educational technology.