Student Evaluation Practices and Assessment Strategies

AI is going to drastically change how faculty perceive assessment and grading (Young, 2023), requiring them to rethink learning outcomes, redesign assignments (Stanford, 2023), and also consider more progressive approaches to student learning instead of more traditional methods.

This statement from CJ Yeh (Fashion Institute of Technology Professor of Communication Design Foundation) and Christie Shin (Fashion Institute of Technology Associate Professor of Communication Design Foundation) describes some of the changes that will need to be made within the field of design education:

We will need a greater focus on interdisciplinary collaboration. In order to solve the increasingly complex problems that contemporary society is facing today, it is critical for aspiring designers to learn how to collaborate effectively with developers, engineers, and other stakeholders. This means students will need to communicate effectively, share ideas, and work together to achieve common goals. Some key learning objectives would include the following:

    1. Critical thinking and problem framing: AI can accomplish many tasks, but it cannot replace creativity, critical thinking, and (most importantly!) empathy. Students need to learn how to use these skills to accurately define problems and come up with new solutions.
    2. Cloud-based remote collaboration: These tools are essential for designers who want to work efficiently and effectively with team members who are located in different places and other fields. Designers can share files, communicate in real time, and track progress on projects from the comfort of their own homes or offices.
    3. AI-assisted design process: Students need to learn how to use AI technologies, including using AI to automate tasks, generate ideas, and test designs.
    4. Ethics and social responsibility: We must stop focusing on simply teaching students how to create the most persuasive ads, seductive designs, addictive games, etc. The next generation of designers needs to learn about the ethical implications of design and social responsibility. This includes learning about privacy, accessibility, and sustainability.

AI’s Impact on Summative Assessment: An Example

In an Alchemy webinar titled “Harnessing the Power of AI: Transforming Assignments and Assessments in Higher Education,” Dr. Danny Liu (University of Sydney) discussed the importance of designing authentic assessments (Villarroel et al., 2017) and of feedback (Carless & Boud, 2018).

The Villarroel et al. study suggests that faculty make assessment more like real-world tasks students might encounter in a future job. When assessments are authentic in this way, students tend to learn better, feel more motivated, and feel like they are managing their own learning. The study also offers a step-by-step model to help faculty create their own authentic assessments in higher education.

Carless and Boud discuss student feedback literacy, which is students’ ability to understand and use feedback to improve their work and learning. The paper focuses on how students respond to feedback and some challenges they face when applying it. Carless and Boud offer two activities that can help students improve their feedback literacy: giving feedback to each other and analyzing examples of good work.

With all of this in mind, Dr. Liu suggests a Two-Lane Approach to assessment strategies. “Lane 1” relies on traditional assessment to ensure that learning outcomes are being met, while “Lane 2” factors in the authentic assessment that students would be more motivated to complete. He uses this example in his presentation to demonstrate the approach:

Short and Longer Term Assessment Strategies

Lane 1: Assurance of Learning Outcomes
Short term:
  • In-person exams/tests
  • Viva voces [oral exams]
Longer term:
  • In-class contemporaneous assessment
  • Interactive oral assessments
  • In-person exams/tests (sparingly)

Lane 2: Human-AI Collaboration
Short term:
  • Students use AI to brainstorm, draft outlines, summarize resources, perform research
  • Students critique AI responses
Longer term:
  • Students collaborate with AI and document this process; the process is graded more heavily than the product

The idea is to try to find balance between traditional assessment methods and new ways to assess student learning by encouraging their collaboration with AI. Dr. Liu provided an example from a marketing class.

Example of a Two-Lane Approach

Learning outcomes: apply marketing strategy concepts in real-world scenarios; demonstrate communication skills; evaluate effectiveness of different strategies.

Further Assessment Strategies

Lane 1: Assurance of Learning Outcomes
  • Live Q&A after in-class presentation (defend research/analysis, etc.)
  • Giving students unseen case study in a live unsupervised setting

Lane 2: Human-AI Collaboration
  • Bing Chat for market research and competitor analysis
  • Adobe Firefly for campaign design
  • Collaboration process is documented (fact-checking, improving, critiquing)
  • In-class presentation
  • Process heavily weighted

In this example, the Lane 2 approach has more components as well as several opportunities for interaction with AI technology. Bing Chat is an AI-powered search engine, Adobe Firefly is an AI that can generate images, and students would have the opportunity to use other AI tools that could help generate text.

Process plays a big role in Dr. Liu’s scenario (see the process book assignment in the next section), and there’s more at stake for students in the Lane 1 assessment.

Alternative Grading Strategies

Alternative grading strategies that have become more popular over the last several years may help faculty think about evaluation in new ways and can reduce students’ perceived need to use generative AI tools inappropriately. These include specifications grading, contract or labor-based grading, and ungrading. Each method is summarized below, along with links to additional information.

Specifications grading

Instructors create assignments with clearly specified requirements and assignments either meet the criteria or they don’t. Revision opportunities are built in.

Contract grading/labor-based grading

Students and instructors agree to a contract in which each grade is tied to a set of criteria like allowed absences, the number of drafts or assignments completed in a satisfactory manner, and the number of reading responses submitted over the semester.

Ungrading

The instructor specifies learning objectives, and self-reflection is used regularly for students to self-assess their progress (in reflective journals, blogs, etc.). Instructors provide students with regular feedback, and midterm or final grades are determined by consultation between the instructor and the students.

Challenges with AI Detection Products

When AI turned into a buzzword early in 2023, there was a lot of discussion about different AI detectors and their effectiveness, including one called GPTZero. Some of these tools claim to be up to 99% accurate, but detectors have also flagged human-written text as AI-generated. In June 2023, Turnitin publicly acknowledged that its software has a higher false positive rate than the company originally stated. In July 2023, OpenAI pulled its detection tool, AI Classifier, because of its “low rate of accuracy” (Nelson, 2023). False positive results can have negative impacts for students, as seen in the example of a Texas A&M professor who suspected his students were using AI to cheat on their final essays. He copied essays into ChatGPT to determine whether his students were cheating and gave out incomplete grades to students in his class, which caused serious problems for graduating seniors, including many who had in fact not used AI on their assignments.

In addition to producing false positives, many AI detectors are biased against non-native English writers, as discussed in a paper by Liang et al. (2023). The book AI for Diversity, by Roger Søraa, discusses a wide range of AI bias, including bias related to gender, age, and sexuality.

Some believe it will be easy to “catch” students who use AI tools because AI-generated text doesn’t sound human. While that may have been the case early on, these language models have continued to improve. An article in the Chronicle got a lot of attention when a student described how many of their peers were using this technology and challenged common assumptions behind academic integrity policies. Consider how the story begins: “Submit work that reflects your own thinking or face discipline. A year ago, this was just about the most common-sense rule on Earth. Today, it’s laughably naive” (Terry, 2023). Faculty need to assume that at least some students are going to seek out this technology.

Some faculty members may choose a more hands-on approach to AI-generated work. For example, if they suspect a student has used AI to produce work for an assignment, they might invite that student to have a one-on-one conversation and ask the student to explain their paper. In any case, it is especially important for faculty not to accuse students outright, as that will result in a lack of trust and will cause students to lose confidence and motivation to complete the course.

So what does this mean for AI detection software at this point? It means faculty can’t rely on detectors. Given all of this, it is even more important to design assignments with AI in mind: by integrating these tools into assignments, faculty can teach students how to use them ethically.

License

Optimizing AI in Higher Education: SUNY FACT² Guide, Second Edition Copyright © by Faculty Advisory Council On Teaching and Technology (FACT²) is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.
