Creating a General Python Autograder

Learn how to write an autograder for Python and implement it with your own custom coding exercises

Written by Gary Gould
Updated over a week ago
Banner: This is a free feature.

Overview

This is an advanced guide to creating an autograder on the CodeHS platform. It assumes that you know how to add an autograder and how an autograder fundamentally works. The guide details how to program an autograder and which tests are available.

Boilerplate Autograder

This is what you see when you first add an autograder to a program. It already contains examples of how to write tests:

Screenshot of the autograder code with information on how the test class is structured.

The Parameters

  • All parameters of these functions are Strings (other than self, which is the calling object)

  • student_code is the text of the student’s program

  • solution_code is the text of the solution program

  • student_output is the text of the student’s output given the input (if there is any)

    • VERY IMPORTANT: this does not contain any prompt text that was printed by calls to input()

  • solution_output is the text of the solution program’s output given the input (if there is any)

Things to keep in mind:

  • If you want to pass more than one set of inputs to the program, you need to create another class. If you don’t (or there is no input), then you can put all of your tests in the default class.

  • To access the input list in the functions, use self.inputs

  • While inputs can hold any data type, it’s best to make it a list of strings. There are errors sometimes when students print numerical data given by inputs without casting it to a string (for instance, by doing print(some_input))

  • It's a good idea to put any code analysis tests in before_run, and then use after_run to check the student’s output.

  • Don’t forget to delete the pass statement if you implement the function

Tests That Can Be Run

You can see the full test options here. To create a test, use expect( … ) and then call one of the following test methods.

Let res be the student’s output/code.

Summary of Expectation Functions

to_contain(expected)

  • Checks that the string expected is contained in res

  • Code: expect(res).to_contain(expected)

  • Example: checks the student has a while loop

    • expect(student_code).to_contain("while")

not_to_contain(expected)

  • Checks that the string expected is NOT contained in res

  • Code: expect(res).not_to_contain(expected)

  • Example: checks the student removed a placeholder pass statement

    • expect(student_code).not_to_contain("pass")

to_be_less_than_or_equal_to(value)

  • Checks res <= expected

  • Code: expect(res).to_be_less_than_or_equal_to(expected)

  • Example: checks the student used no more than 4 for loops

    • num_fors = student_code.count("for")
      expect(num_fors).to_be_less_than_or_equal_to(4)

to_be_greater_than(value)

  • Checks res > expected

  • Code: expect(res).to_be_greater_than(expected)

  • Example: checks the student printed at least 4 lines

    • lines = student_output.splitlines()  # creates a list of lines
      expect(len(lines)).to_be_greater_than(3)
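Counting output lines depends on how the text is split. The following is a small plain-Python sketch (the sample output string is made up for illustration, and no autograder functions are called) of why Python's built-in `splitlines()` is a safer choice than `split("\n")` when the output ends with a newline:

```python
# Hypothetical sample of what student_output might hold after four print() calls.
sample_output = "Line 1\nLine 2\nLine 3\nLine 4\n"

# splitlines() ignores the trailing newline, so the count matches the printed lines.
lines = sample_output.splitlines()

# split("\n") keeps an empty string after the final newline, inflating the count.
naive_lines = sample_output.split("\n")

print(len(lines))        # 4
print(len(naive_lines))  # 5
```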

to_be_less_than(value)

  • Checks res < expected

  • Code: expect(res).to_be_less_than(expected)

  • Example: checks the student used no more than 4 for loops

    • num_fors = student_code.count("for")
      expect(num_fors).to_be_less_than(5)

to_be_greater_than_or_equal_to(value)

  • Checks res >= expected

  • Code: expect(res).to_be_greater_than_or_equal_to(expected)

  • Example: checks the student printed at least 4 lines

    • lines = student_output.splitlines()  # creates a list of lines
      expect(len(lines)).to_be_greater_than_or_equal_to(4)

to_equal(value)

  • Checks res == expected

  • You usually want to use this one and not to_be(...)

  • Code: expect(res).to_equal(expected)

  • Example: checks the student printed the correct first line

    • lines = student_output.splitlines()
      expect(lines[0]).to_equal("My Fun Program")

not_to_equal(value)

  • Checks res != expected

  • Code: expect(res).not_to_equal(expected)

  • Example: checks the student changed the last line of output

    • lines = student_output.splitlines()
      expect(lines[-1]).not_to_equal("Change this output")

to_be_truthy()

  • Checks res == True

  • Takes no parameters; useful for custom tests or checking multiple things

  • Code: expect(res).to_be_truthy()

  • Example: checks the student used a for loop and did not use a while loop

    • used_for = "for" in student_code
      not_use_while = "while" not in student_code
      loops_right = used_for and not_use_while
      expect(loops_right).to_be_truthy()

to_be_falsey()

  • Checks res == False

  • Takes no parameters; useful for custom tests or checking multiple things

  • Code: expect(res).to_be_falsey()

  • Example: checks the student changed the starter code

    • code_unchanged = student_code == "# Put code here"
      expect(code_unchanged).to_be_falsey()

to_be(some_obj)

  • Checks that res is the same as the object expected

  • Note this is not the same as res == expected; this is res is expected

  • Code: expect(res).to_be(expected)

  • This often works for strings because CPython may reuse (intern) identical string objects, but that behavior is not guaranteed, so prefer to_equal(...) when comparing values

not_to_be(some_obj)

  • Checks that res is not the same as the object expected

  • Code: expect(res).not_to_be(expected)
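The difference between to_equal and to_be mirrors Python's `==` and `is` operators. The following is a minimal plain-Python sketch of that distinction; it does not call the autograder API, and the variable names are only illustrative:

```python
# Two lists with equal contents but built separately, so they are distinct objects.
first = ["a", "b"]
second = ["a", "b"]
alias = first  # a second name for the same object

print(first == second)  # True: same value
print(first is second)  # False: different objects
print(first is alias)   # True: same object
```

This is why to_equal is usually the better fit: most tests care about the value, not about which object holds it.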

Test Options

After any of the previous methods, you can chain a call to with_options() to customize the following:

  • test_name: the test name displayed to the student

    • defaults to a string representation of the test

      • for example, something like: Expected "print('hello world')" to contain "print"

  • message_pass: Message displayed if the test passed

    • defaults to nothing

  • message_fail: Message displayed if the test failed

    • defaults to nothing

  • student_output: What is shown as the student output (labeled “your result”)

    • defaults to what’s passed to expect

  • solution_output: What is shown as the solution output

    • defaults to empty

  • show_diff: Shows the difference between the student’s output and the solution output

    • defaults to False

    • Note: unless the program only passes with very specific formatting, students usually find this more confusing than helpful

Test Options Examples

  • Example that checks the student used a for loop and did not use a while loop

    • used_for = "for" in student_code
      not_use_while = "while" not in student_code
      loops_right = used_for and not_use_while
      expect(loops_right).to_be_truthy().with_options(
          test_name="You should use a for loop for this program",
          message_pass="Great!",
          message_fail="You should not use a while loop!",
          student_output=student_code)

    • Note: it’s a good idea to set the student_output here; otherwise, student_output would be the value of loops_right

  • Example that checks the student changed the last line of output

    • lines = student_output.splitlines()
      expect(lines[-1]).not_to_equal("Change this output").with_options(
          test_name="You should customize the last line of output",
          message_pass="Great!",
          message_fail="Check your last line!")

    • Note: In this case, student_output will be the last line that the student printed

Tests With Input

To set the input to a program, put the values in the input list.

The program will be run once with the given input. If you want to test a different set of input, you need to write another test class (see examples below).

The input should be given as strings. This minimizes the number of weird errors that occur for the student. To use the inputs list in the before_run or after_run functions, use self.inputs.

Example

Problem: The student is supposed to ask the user for a number of feet and a number of inches, then print out the total number of inches.

Test Cases:

  • 3 ft, 4 inches → 40 inches

  • 0 ft, 6 inches → 6 inches

  • 12 ft, 1 inch → 145 inches
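The expected totals in these test cases follow from feet × 12 + inches. As a minimal sketch, assuming the inputs arrive as a list of strings in the order feet, inches (the helper name `expected_total_inches` is hypothetical and not part of the autograder):

```python
def expected_total_inches(inputs):
    """Return the total inches for an inputs list like ["3", "4"] (feet, inches)."""
    feet = int(inputs[0])
    inches = int(inputs[1])
    return feet * 12 + inches


# The three test cases from the problem statement, written as lists of strings.
for case in (["3", "4"], ["0", "6"], ["12", "1"]):
    print(case, "->", expected_total_inches(case))
```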

The Autograder:

Screenshot of the autograder code covering the first two test cases above.
Screenshot of the autograder code for the last test case.

Some things to note:

  • inputs are lists of strings

  • Each set of input needs its own class (Suite, Suite01, Suite02)

    • You can name these classes anything you want; just be sure to inherit from PythonTestSuite

  • Each class needs to be created at the bottom (lines 57 - 59)

  • The tests are written in such a way that once the first one works, you can copy and paste the entire class -- the only thing you have to change is the inputs list and the class name!

  • self.inputs refers to that class’s inputs list

  • I used format with the strings because I think it looks cleaner. String concatenation works just as well.

  • student_output isn’t set in the test options since it will default to student_output

    • since student_output is the parameter for expect

  • Since solution_output isn’t set in the test options, the expected output won’t be shown to students
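Since the screenshots may not be visible, here is a rough sketch of how two of these suites might be laid out. It is not the code from the screenshots. The `_Expectation`, `expect`, and `PythonTestSuite` definitions at the top are simplified stand-ins written only so the sketch runs outside CodeHS (on the platform these are provided for you and behave differently), and `recorded` and `check_total_inches` are hypothetical names introduced for illustration.

```python
# --- Stand-ins so the sketch runs outside CodeHS; these are NOT the real API ---
recorded = []  # collects every expectation so the sketch can inspect results


class _Expectation:
    def __init__(self, value):
        self.value = value
        self.passed = None
        self.options = {}

    def to_contain(self, expected):
        self.passed = expected in self.value
        recorded.append(self)
        return self

    def with_options(self, **options):
        self.options = options
        return self


def expect(value):
    return _Expectation(value)


class PythonTestSuite:
    inputs = []
# --- End of stand-ins ---


def check_total_inches(inputs, student_output):
    """Shared check: the output should contain the total number of inches."""
    total = str(int(inputs[0]) * 12 + int(inputs[1]))
    expect(student_output).to_contain(total).with_options(
        test_name="{} ft, {} in should print {}".format(inputs[0], inputs[1], total),
        message_pass="Great!",
        message_fail="Check your feet-to-inches conversion.",
    )


class Suite(PythonTestSuite):
    inputs = ["3", "4"]  # feet, inches

    def after_run(self, student_code, solution_code, student_output, solution_output):
        check_total_inches(self.inputs, student_output)


class Suite01(PythonTestSuite):
    inputs = ["0", "6"]  # only the inputs list and the class name change

    def after_run(self, student_code, solution_code, student_output, solution_output):
        check_total_inches(self.inputs, student_output)


# On CodeHS each class is created at the bottom of the file. Here the sketch
# calls after_run by hand with made-up student output to show the outcome.
Suite().after_run("", "", "40\n", "40\n")
Suite01().after_run("", "", "7\n", "6\n")
print([result.passed for result in recorded])  # [True, False]
```

Moving the shared logic into one function is a design choice for brevity; copying the whole class and changing only the inputs list and class name works the same way.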

General Tips

  • You very, very rarely want to use expect(student_output).to_equal(solution_output)

    • It’s better to just look for key pieces

  • To debug your tests, set student_output or solution_output in with_options() to see the values of your variables

    • Remember these have to be string values

  • String comparisons are case sensitive and whitespace sensitive

    • use str.lower() to make str all lowercase

    • use str.replace(' ', '') to remove spaces (but not all whitespace)

      • first parameter is a single space, second is an empty string

    • use "".join(str.split()) to remove ALL whitespace

      • "" is an empty string

  • You can use strip_comments(str) to get rid of any comments from the code

    • helpful for when you’re checking that they didn’t use for loops for instance

    • (background story: there was a program that was failing because it said the for loop was incorrect. Turns out, the autograder was looking at a ‘for’ it had found in a comment, and missed the one in the code)
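The comment problem from the background story can be reproduced in plain Python. In the sketch below, `strip_comments_naive` is a hypothetical, simplified stand-in written for illustration; it is not the platform's strip_comments function, and the student program is made up.

```python
def strip_comments_naive(code):
    """Hypothetical stand-in: drop everything from the first '#' on each line.

    This also cuts '#' characters that appear inside string literals,
    so treat it as an illustration only.
    """
    return "\n".join(line.split("#", 1)[0] for line in code.splitlines())


# Made-up student program: the only "for" is inside a comment.
student_code = "# Use a FOR loop here\ncount = 0\nwhile count < 3:\n    count += 1\n"

print("for" in student_code.lower())                        # True (false positive)
print("for" in strip_comments_naive(student_code).lower())  # False
```

Lowercasing before the check handles the case sensitivity mentioned above, and stripping comments first keeps the test from matching text the student never ran.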

Common Issues:

  1. Many autograders are checking for specific text inside a student’s code (syntax). Many don’t allow for different spacing schemes.

    1. In this scenario, the following function can be used to strip the spaces from the student code before checking for the text:

        # Takes a string and removes all whitespace (spaces, tabs, new lines, etc.)
        def remove_whitespace(my_string):
            return ''.join(my_string.split())

    2. This line could also be called on the string variable:

        cleaned_variable = string.replace(' ', '')
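The two approaches differ when the student uses tabs or newlines. A short sketch with made-up student code (the `target` string is an arbitrary example of text an autograder might look for):

```python
def remove_whitespace(my_string):
    """Remove all whitespace (spaces, tabs, newlines) from a string."""
    return "".join(my_string.split())


# Made-up student code that uses a tab and uneven spacing around '='.
student_code = "total\t=  feet * 12 + inches\nprint(total)\n"
target = "total=feet*12+inches"

print(target in student_code.replace(" ", ""))    # False: the tab is still there
print(target in remove_whitespace(student_code))  # True
```

Note that removing all whitespace also joins separate lines together, so a target could match text that spans a line break.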

Common Checks:

  1. Save expected output in a variable and use in test name

  2. Multiple checks using ‘and’

  3. Checking expected output against student output

def after_run(self, student_code, solution_code, student_output, solution_output):
    expected = 'There is an a in your name, first found at index 2'
    student_output = student_output.lower()
    output_correct = 'is an a' in student_output and 'index 2' in student_output
    expect(output_correct).to_be_truthy().with_options(
        test_name='If my name is "Tracy", you should say %s' % expected,
        student_output=student_output,
    )

Still have questions? Contact our team at hello@codehs.com to learn more!
