PIXEL and Generative Artificial Intelligence

Human-like Chat Experience for Pre-screening Talent

PIXEL with Generative Artificial Intelligence (GenAI) provides a truly conversational and intelligent AI for talent pre-screening. Unlike traditional forms or pseudo-conversations, this platform offers a genuine chat experience powered by advanced AI. The GenAI-powered bots are:

  • Asynchronous: Candidates can apply anytime, even at 3 AM on a Sunday, and receive an immediate bot interview. This ensures 24/7 availability and an additional touchpoint in their journey.
  • Resilient and Adaptive: The AI gracefully handles misspellings, unclear answers, and responds to doubts or additional questions, gathering necessary information without breaking or repeating itself.
  • Safe and Ethical: AI in PIXEL is powered by Claude, the safest, most empathetic and patient option available. It does not have access to candidate PII and will not make employment decisions, acting instead as a fact-finding assistant. Recruiters remain in control, with human-led scoring, ensuring equity and unbiased processes.
  • Multilingual: Leveraging modern LLMs, interviews can be conducted in multiple languages. For example, set up an English interview, and if a candidate responds in Spanish, the bot will switch seamlessly. Supported languages include English, Spanish, French, German, Italian, and Portuguese.




Setup and Configuration

  • PIXEL Bot Settings:

    • GenAI Toggle: Enabled by default for new bots but can be turned off to revert to the NLP bot experience. Existing bots can be upgraded to GenAI by toggling this feature on.



    • Confidence: The GenAI bot does not use the confidence setting, so it has no impact on the interview outcome. This change was made due to the low reliability of self-reported confidence scores. 

  • GenAI Consent/Disclosure:

    • Each GenAI interview begins with a message about AI usage and requests consent to comply with multi-state laws. If consent is declined, the interview ends, and a 'Did Not Consent' result is recorded.




  • Multilingual Capabilities:

    • Supported languages include English, Spanish, French, German, Italian, and Portuguese.
    • Additional languages are dependent on Anthropic's support; new languages will be available immediately once supported by Anthropic.
    • Recruiters can set up a bot in English, and talent can take it in any supported language with no extra configuration. The bot automatically adapts to the talent's language.
    • Text configured by recruiters, such as the introduction and closing messages, is also automatically translated.

      Pixel Language.png


Reviewing Interview Results:

  • Interview Transcripts: A new column for transcripts is available only for GenAI bots, retained for one year after interview completion.



  • Required Questions: If a talent does not meet the threshold for a required question, their overall rating is zero. All other scores remain unchanged and can be viewed by clicking on the talent's name.

    pixel fail.png
  • Consent Decline: If a talent declines AI interview consent, they appear in the interview results tab with 'Did Not Consent' and a question mark for the match. Declining consent is not a failure.

    pixel did not consent.png


Testing Bots:

  • Score Visibility:

    • After completing a GenAI interview test, recruiters can see the score their responses would have received. This is not visible to talent.



How Pixel GenAI Scores:

Scoring Philosophy and Approach

The goal for the new scoring system is to simplify the math from the old system and generalize the scoring functions to be applicable to multiple question types.   

How weight affects the score:  

The maximum score for each Pixel qualifier is 5 points without weights. Multiplying this by the weight for a qualifier gives the total number of points available for that qualifier. For example, an authorization question with a weight of 5 can earn up to 25 points, a skill question with a weight of 3 can earn up to 15 points, and so on.
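
A minimal sketch of this weighting rule. The names `MAX_UNWEIGHTED_SCORE` and `max_points` are illustrative assumptions and are not taken from Pixel's code:

```python
# Illustrative only: names are assumptions, not Pixel's implementation.
MAX_UNWEIGHTED_SCORE = 5  # maximum points per qualifier before weighting


def max_points(weight: int) -> int:
    """Total points available for a qualifier with the given weight."""
    return MAX_UNWEIGHTED_SCORE * weight


print(max_points(5))  # authorization question with weight 5: 25
print(max_points(3))  # skill question with weight 3: 15
```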

The role of AI: 

While the interview chat itself is powered by Claude, scoring is not. All Claude does is extract the part of the transcript needed to score a question. Let's take a look at just one question to see what this means: 

The bot asks: "How many years of experience do you have working as a software engineer?" 

Talent responds: "I just finished my fourth year last week!" 

Here, Claude extracts the number 4 as the answer to the question and passes it to the scoring system, which computes the score. We have concerns about the potential adverse impact of using LLMs as judges, so for scoring we use LLMs only to retrieve parts of the transcript.

  

Required/preferred:  

The required/preferred toggle on a qualifier controls the scheduling behaviour of the bots and the overall rating of an interview.  

If talent does not meet the specified threshold for a required question (regardless of how well they did on the rest of the interview), two things happen:   

  1. At the end of a bot interview, the bot does not try to schedule a follow-up interview with a recruiter  
  2. The overall rating for the interview is 0%.   
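
The required-question rule can be sketched as a small gate applied after scoring. This is a hypothetical illustration: the `InterviewOutcome` type and `apply_required_rule` function are invented names, and the sketch assumes that follow-up scheduling proceeds normally when every required threshold is met.

```python
from dataclasses import dataclass


@dataclass
class InterviewOutcome:
    overall_rating_percent: float
    offer_follow_up_scheduling: bool


def apply_required_rule(rating_percent: float, all_required_met: bool) -> InterviewOutcome:
    """Zero the rating and skip scheduling if any required threshold was missed."""
    if not all_required_met:
        return InterviewOutcome(0.0, False)
    # Assumption: with all required thresholds met, the computed rating stands.
    return InterviewOutcome(rating_percent, True)


print(apply_required_rule(80.0, all_required_met=False))
```

Individual qualifier scores are unaffected by this gate; only the overall rating and the scheduling behaviour change.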

  

Overall interview score: 

The maximum possible score for an interview is 5 * (sum of the weights of all the qualifiers). We add together the talent's weighted qualifier scores, and the final rating is that sum expressed as a percentage of the total points available.

Example: 

Interview has 2 questions: a salary question with weight 4 and a skill question with weight 5. 

The talent scores 16 / 20 for the salary question, and 20 / 25 for the skill qualifier.  

The total number of points available is 5 * 4 + 5 * 5 = 20 + 25 = 45 points. 

Total talent score = 16 + 20 = 36 

Final rating = (36 / 45) * 100 = 80% 
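
The same calculation as a short sketch. The function name `overall_rating` and its input shape are assumptions made for illustration, not Pixel's actual interface:

```python
MAX_UNWEIGHTED_SCORE = 5  # per-qualifier maximum before weighting


def overall_rating(qualifiers: list[tuple[float, int]]) -> float:
    """Return the final rating as a percentage.

    Each entry is (weighted points earned, weight) for one qualifier.
    """
    total_available = sum(MAX_UNWEIGHTED_SCORE * weight for _, weight in qualifiers)
    if total_available == 0:
        raise ValueError("at least one qualifier with a positive weight is needed")
    total_earned = sum(points for points, _ in qualifiers)
    return 100 * total_earned / total_available


# Salary: 16 of 20 points (weight 4); skill: 20 of 25 points (weight 5).
print(overall_rating([(16, 4), (20, 5)]))  # 80.0
```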

  

Scoring For Different Question Types

Authorization:  

If the talent is authorized to work, they get full points. Otherwise, they get zero.  

  

Education:   

We have a ranking for educational qualifications:  

None: 0  

HS: 1  

Associate's: 2  

Bachelor's: 3  

Master's: 4  

Doctoral: 5  

If the talent's educational attainment is at or above the level required by the qualifier, they get full points. If not, then they get zero.  

  

Security clearance:  

Like the Education qualifier, we have a ranking for security clearances:  

None: 0  

Confidential: 1  

Secret: 2  

Top Secret: 3  

TS/SCI: 4  

If the talent's clearance is at or above the level required, they get full points. Otherwise they get zero.  
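
Education and security clearance follow the same all-or-nothing pattern, so one sketch can cover both. The dictionary and function names here are illustrative assumptions; only the rankings themselves come from the lists above.

```python
MAX_UNWEIGHTED_SCORE = 5  # per-qualifier maximum before weighting

EDUCATION_RANK = {
    "None": 0, "HS": 1, "Associate's": 2,
    "Bachelor's": 3, "Master's": 4, "Doctoral": 5,
}
CLEARANCE_RANK = {
    "None": 0, "Confidential": 1, "Secret": 2, "Top Secret": 3, "TS/SCI": 4,
}


def ranked_score(ranking: dict[str, int], talent_level: str,
                 required_level: str, weight: int) -> int:
    """Full points if the talent's level is at or above the requirement, else zero."""
    meets_requirement = ranking[talent_level] >= ranking[required_level]
    return MAX_UNWEIGHTED_SCORE * weight if meets_requirement else 0


print(ranked_score(EDUCATION_RANK, "Master's", "Bachelor's", weight=3))  # 15
print(ranked_score(CLEARANCE_RANK, "Secret", "Top Secret", weight=4))    # 0
```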

  

Resource:  

We simply check whether talent has uploaded an attachment. If they have, then they get full points, otherwise zero.  

We do not read attachments for scoring. 

  

Commute time:   

Previously, there was a hardcoded 30 min limit for the commute time, and times above this were linearly rated down. Now we leave it up to the talent, simply telling them the estimated commute time from the worksite. If they are satisfied, we are satisfied.  

We define a satisfaction scale:  

VeryDissatisfied: 0,  

Dissatisfied: 1,  

Neutral: 2,  

Satisfied: 3,  

VerySatisfied: 4  

Then we convert the talent's response into a fraction of the maximum score.  

Example: 

Commute time qualifier has a weight of 5. Talent responds 'Dissatisfied', which has a value of 1, so the score for the qualifier would be:

CommuteScore = (zeroToFourScore / 4) * maxScore * weight
             = (1 / 4) * 5 * 5
             = 6.25 points out of a possible 25
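
A sketch of the commute calculation, assuming the 5-point per-qualifier maximum described in the weighting section. The names `SATISFACTION` and `commute_score` are illustrative, not Pixel's code:

```python
MAX_UNWEIGHTED_SCORE = 5  # per-qualifier maximum before weighting (assumed from the weighting section)

SATISFACTION = {
    "VeryDissatisfied": 0,
    "Dissatisfied": 1,
    "Neutral": 2,
    "Satisfied": 3,
    "VerySatisfied": 4,
}


def commute_score(response: str, weight: int) -> float:
    """Convert a satisfaction response into a fraction of the weighted maximum."""
    fraction = SATISFACTION[response] / 4
    return fraction * MAX_UNWEIGHTED_SCORE * weight


# 'Satisfied' with weight 2: (3 / 4) * 5 * 2
print(commute_score("Satisfied", weight=2))  # 7.5
```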

  

Yes or No:  

For a yes question, we give full points if the answer is "Yes", and zero if the answer is "No".  

The no question is the same: "No" gives full points, "Yes" gives zero.  

The bot tries its best to get a simple Yes/No answer. If it cannot get one, it prefers to score the question at zero. It gracefully handles answers like 'sometimes' or 'often'.
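
A sketch of Yes/No scoring under these rules. The function name and the use of `None` for "no clear answer extracted" are assumptions for illustration only:

```python
from typing import Optional

MAX_UNWEIGHTED_SCORE = 5  # per-qualifier maximum before weighting


def yes_no_score(answer: Optional[bool], expected: bool, weight: int) -> int:
    """Score a Yes/No qualifier.

    answer: True for "Yes", False for "No", None if no clear answer was obtained.
    expected: True for a yes question, False for a no question.
    """
    if answer is None:
        return 0  # no clear Yes/No: scored at zero
    return MAX_UNWEIGHTED_SCORE * weight if answer == expected else 0


print(yes_no_score(True, expected=True, weight=2))   # 10
print(yes_no_score(True, expected=False, weight=2))  # 0
print(yes_no_score(None, expected=True, weight=2))   # 0
```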

  

Salary:   

For salary, we wanted to allow answers slightly over the limit to be scored down but not zero if they are reasonably close to the recruiter's desired range.   

Since recruiters provide a minimum and maximum salary, we can define a salary range: maximum - minimum. We can also define an overage: the amount by which the talent's requirement exceeds the recruiter's maximum.

If the talent's requirement is at or under the maximum, they get full points. If the overage is greater than the range, they get zero points. If the talent is over the maximum but the overage is less than the range, the score is reduced in proportion to the overage: the fraction overage / range of the available points is deducted.

Example: 

Minimum: 100, Maximum: 120, Talent responds: 130, weight: 3  

Range = 120 – 100 = 20 (this is the difference between the recruiter's minimum and maximum salary) 

Overage = 130 – 120 = 10 (this is the amount that the talent's requirement is above the maximum) 

Range is greater than overage, so we do overage / range = 10 / 20 = 0.5.  

Now we score the talent's response:  

SalaryScore = maxScore * weight * (1 - overage / range)
            = 5 * 3 * (1 - 0.5)
            = 7.5 points out of a possible 15
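
The salary rule as a sketch. The function and variable names are illustrative assumptions; treating an overage exactly equal to the range as zero points follows from the formula and also avoids dividing by a zero-width range.

```python
MAX_UNWEIGHTED_SCORE = 5  # per-qualifier maximum before weighting


def salary_score(minimum: float, maximum: float, requested: float, weight: int) -> float:
    """Full points at or under the maximum, scaled down as the overage approaches the range."""
    max_points = float(MAX_UNWEIGHTED_SCORE * weight)
    if requested <= maximum:
        return max_points
    salary_range = maximum - minimum
    overage = requested - maximum
    if overage >= salary_range:
        return 0.0
    return max_points * (1 - overage / salary_range)


# Minimum 100, maximum 120, talent asks for 130, weight 3.
print(salary_score(100, 120, 130, weight=3))  # 7.5
```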

  

Start Date:  

Like the salary qualifier, we wanted to allow answers that are reasonably close to the recruiter's required date. This means the scoring is the same (it's the exact same code) as the salary question, with salary numbers replaced with the number of days. If the talent's date is on or before the required date, they get full points.

Example: 

Current date is 1/1 and the recruiter wants talent to start within 5 days of completing the interview, i.e. 1/6 at the latest. Talent responds they can start on 1/7. Weight is 5.

Range: 1/6 - 1/1 = 5 days (this is the number of days between today and the latest possible start date)

Overage: 1/7 - 1/6 = 1 day (this is the number of days by which the talent will miss the latest possible start date)

If the overage is greater than the range (i.e. if the talent will miss the latest start date by more days than there are between today and the latest possible start date), then talent gets a zero. Otherwise, the formula is:

StartScore = maxScore * weight * (1 - overage / range)
           = 5 * 5 * (1 - 1 / 5)
           = 25 * 0.8
           = 20 points out of a possible 25
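
The start-date rule as a sketch using Python's standard `datetime.date`. The function name is an illustrative assumption, and the year in the example is arbitrary because the source gives only month and day.

```python
from datetime import date

MAX_UNWEIGHTED_SCORE = 5  # per-qualifier maximum before weighting


def start_date_score(today: date, latest_start: date, talent_start: date, weight: int) -> float:
    """Same shape as the salary rule, with amounts replaced by day counts."""
    max_points = float(MAX_UNWEIGHTED_SCORE * weight)
    if talent_start <= latest_start:
        return max_points
    range_days = (latest_start - today).days
    overage_days = (talent_start - latest_start).days
    if overage_days >= range_days:
        return 0.0
    return max_points * (1 - overage_days / range_days)


# Today 1/1, latest start 1/6, talent can start 1/7, weight 5 (year chosen arbitrarily).
score = start_date_score(date(2025, 1, 1), date(2025, 1, 6), date(2025, 1, 7), weight=5)
print(f"{score:.1f}")  # 20.0
```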

  

Freeform:  

Due to concerns about the objectivity and fairness of LLMs, we do not score freeform questions. A snippet of the transcript containing the talent's response is displayed in the interview report instead of a score. Complete interview transcripts can also be viewed in the Pixel Interviews page. 

  

Skill:  

For skills, we wanted to reward partial fulfillment of a recruiter's requirement. For example, if a recruiter asks for 3 years and talent has 2 years, we don't want to give the talent a 0.  

Responses at or above the recruiter's requirement are given full points. For responses below the requirement, we compute the score as a fraction of the maximum score.

Example: 

Required experience: 7 years, Talent responds: 3 years, weight: 4  

SkillScore = maxScore * weight * (talent / required)
           = 5 * 4 * (3 / 7)
           = 60 / 7
           = approximately 8.57 points out of a possible 20
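
The skill rule as a sketch. The function name is an illustrative assumption, and the sketch assumes experience values are non-negative:

```python
MAX_UNWEIGHTED_SCORE = 5  # per-qualifier maximum before weighting


def skill_score(required_years: float, talent_years: float, weight: int) -> float:
    """Full points at or above the requirement, proportional credit below it."""
    max_points = float(MAX_UNWEIGHTED_SCORE * weight)
    if talent_years >= required_years:
        return max_points
    return max_points * talent_years / required_years


# Required 7 years, talent has 3 years, weight 4: 60 / 7.
print(round(skill_score(7, 3, weight=4), 2))  # 8.57
```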



 

 
