
Frequently Asked Questions

CV/Resume Parser

What is a resume/CV parser?

A resume parser (CV parser) is used within human resource software and on recruitment websites, job boards and candidate application portals to simplify and accelerate the application process. It does so by extracting and classifying thousands of attributes about the candidate. A resume parser also provides a foundation for the semantic searching of candidate data because the parser classifies data as to its meaning, and assigns values to each data point such as time or recency. The parser identifies hundreds of different kinds of information within a resume or CV and clearly tags each data point (for example: first name, last name, street address, city, educational degrees, employers, skills, etc.). Textkernel has been the industry innovator in building this technology for over two decades.

How do I use the CV/Resume Parser?

The Resume Parser is available via our SaaS service's REST API. Using our public SaaS service requires a developer to integrate with a single API endpoint. To learn more about how to make this API call, refer to our documentation.
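
As a minimal sketch of such an integration, the Python snippet below posts one document to a parse endpoint. The base URL, header names, and request field shown here are assumptions for illustration only; take the exact endpoint, credential headers, and schema from the API documentation.

```python
import base64
import requests

# Assumed values for illustration; verify the real endpoint and header names in the docs.
TX_BASE_URL = "https://api.us.textkernel.com/tx/v10"   # region-specific base URL (assumed)
ACCOUNT_ID = "your-account-id"
SERVICE_KEY = "your-service-key"

def parse_resume(path: str) -> dict:
    """Send one resume to the (assumed) parse endpoint and return the JSON response."""
    with open(path, "rb") as f:
        doc_b64 = base64.b64encode(f.read()).decode("ascii")

    response = requests.post(
        f"{TX_BASE_URL}/parser/resume",                              # assumed endpoint path
        headers={"Tx-AccountId": ACCOUNT_ID, "Tx-ServiceKey": SERVICE_KEY},
        json={"DocumentAsBase64String": doc_b64},                    # assumed field name
        timeout=90,
    )
    response.raise_for_status()
    return response.json()

result = parse_resume("candidate.docx")
```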

How are the results structured?

The REST API returns results in JSON format following the schema found in our documentation. We do not support Excel or CSV output.
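
The snippet below sketches how a response might be navigated from Python once it has been decoded from JSON. The property paths are assumptions for illustration; the authoritative field names and structure are defined by the schema in our documentation.

```python
# Continues the parse_resume sketch above; property paths below are assumed, not authoritative.
parsed = parse_resume("candidate.docx")

resume_data = parsed.get("Value", {}).get("ResumeData", {})              # assumed path
candidate_name = resume_data.get("ContactInformation", {}).get("CandidateName", {})
skills = resume_data.get("Skills", {})

print("Candidate:", candidate_name.get("FormattedName"))                 # assumed field
print("Skills keys:", list(skills.keys()))
```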

Can I try the Textkernel CV/Resume Parser for free?

Absolutely! You can sign up for a free trial and test with our UI or via the API. This gives you access to our interactive demo interface, or you can make API calls to our SaaS REST API using the supplied credentials.

What resume formats can the Textkernel CV/Resume Parser process?

Essentially any resume or CV format, including all of the popular job board formats and exports from social and professional networks. Compressed file formats are not supported.

Does the CV/Resume Parser store my resumes?

No, the CV/Resume Parser does not store any resumes. All parsing is done in-memory so there is never any data written to a file system or database.

Does the Textkernel parser add any intelligence to the parsed resume/job?

Yes, the parser adds several enrichments (a purely illustrative sketch of the enriched output follows this list):

  • Candidate Summary: A summary of who the candidate is today, a management summary, average time at each employer, etc.
  • Profession Normalization: extracted job titles are classified according to our taxonomy, plus mappings to the international O*NET and ISCO standards
  • Skills Normalization: extracted skills are classified according to our taxonomy, and connected to related profession groups via our ontology
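
The example below is purely illustrative of what normalized profession and skill data can look like after these enrichments; the field names and codes are hypothetical and do not reflect the actual output schema.

```python
# Hypothetical field names, shown only to illustrate the enrichments above.
enriched_example = {
    "ProfessionNormalization": {
        "RawJobTitle": "Sr. Java Developer",
        "NormalizedProfession": "Software Developer",   # Textkernel taxonomy concept (illustrative)
        "ONET": "15-1252.00",                           # O*NET mapping (illustrative)
        "ISCO": "2512",                                 # ISCO mapping (illustrative)
    },
    "SkillsNormalization": [
        {"RawSkill": "Spring Boot", "NormalizedSkill": "Spring Framework"},   # illustrative
    ],
}
```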

How do I process batch transactions?

Routine parsing of batches of fewer than 100,000 documents MUST always be done serially (one at a time). To read more about batches and concurrency, see here.
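
In practice, serial processing means sending one document, waiting for its response, and only then sending the next, as in the sketch below (which reuses the hypothetical parse_resume helper from the earlier sketch).

```python
import glob

# Process a batch one document at a time; do not issue these requests concurrently.
for path in sorted(glob.glob("resumes/*.pdf")):
    result = parse_resume(path)   # hypothetical helper from the earlier sketch
    # ...store `result` before moving on to the next document
```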

Why do I need to specify a Document Last Modified Date (formerly Revision Date)?

Document Last Modified Dates tell the parser when the document was last revised, which impacts the interpretation of terms such as 'current'. If you do not provide this date, then the latest experience on old resumes will be interpreted as current. Please make sure to get this right from the start, or you risk having to reparse your documents. For more information, go here.
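
A minimal sketch of supplying this date from the file's own last-modified timestamp is shown below. The DocumentLastModified field name is taken from this FAQ's wording and the date format is assumed to be ISO (YYYY-MM-DD); confirm both against the documentation. The endpoint constants are reused from the earlier parsing sketch.

```python
import base64
import datetime
import os
import requests

def parse_resume_with_date(path: str) -> dict:
    """Parse a resume and tell the parser when the document was last revised."""
    last_modified = datetime.date.fromtimestamp(os.path.getmtime(path)).isoformat()
    with open(path, "rb") as f:
        doc_b64 = base64.b64encode(f.read()).decode("ascii")

    response = requests.post(
        f"{TX_BASE_URL}/parser/resume",                  # assumed endpoint, see earlier sketch
        headers={"Tx-AccountId": ACCOUNT_ID, "Tx-ServiceKey": SERVICE_KEY},
        json={
            "DocumentAsBase64String": doc_b64,
            "DocumentLastModified": last_modified,       # e.g. "2023-04-15"; field name and format assumed
        },
        timeout=90,
    )
    response.raise_for_status()
    return response.json()
```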

Search & Match

What is the difference between search, match, and bimetric scoring?

Matching is a fully automated process. You give the engine a document (a job or a resume) and tell it to bring you back the best matches (jobs or resumes). The engine determines the relevant criteria and returns the best candidates. Matching allows humans to tell the engine what types of data are important using category weights while still letting Textkernel do the heavy lifting of generating the queries and scoring the results.

Searching uses a human-specified query. The recruiter tells the engine exactly what she wants, and the engine brings back documents that match those specific criteria.

Bimetric scoring is both a feature and a product. Bimetric scoring as a feature is the process by which our matching engine looks at two directions of fit, such as (1) how well does the candidate fit the job? (2) how well does the job fit the candidate? Bimetric scoring as a product is used to score a user-determined set of documents. For example, using the Bimetric Scoring product, you can see exactly how a set of 50 applicants for a job score against that job.
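
To make the two directions of fit concrete, here is a toy illustration in Python. It is not Textkernel's scoring algorithm; it simply shows that a candidate-to-job score and a job-to-candidate score are two different measurements that can then be viewed side by side or combined.

```python
# Toy illustration of bidirectional fit; this is NOT Textkernel's actual scoring algorithm.
def coverage(source: set[str], target: set[str]) -> float:
    """Fraction of `target` terms that also appear in `source`."""
    return len(source & target) / len(target) if target else 0.0

job_requirements = {"python", "sql", "etl", "airflow"}
candidate_skills = {"python", "sql", "spark", "etl"}

job_fit = coverage(candidate_skills, job_requirements)        # how well the candidate fits the job
candidate_fit = coverage(job_requirements, candidate_skills)  # how well the job fits the candidate
print(f"job fit: {job_fit:.0%}, candidate fit: {candidate_fit:.0%}")
```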

Does the Textkernel matching engine learn from user interaction?

No, because that is a terrible practice! Our engine doesn't learn based on your recruiter's clicks because there is no way to know why the user clicked on a profile to review, or what they really thought about what they saw. Systems that "learn" from recruiter clicks quickly go off the rails as the system "learns" from poor performers and makes invalid assumptions about intentions. Our algorithms are curated by our team of experts and are explainable down to each detail.

How is PII protected and ignored in your algorithm?

Parsed documents that have been sent to the API to be indexed are scrubbed of ALL personal data and personally identifiable information (PII) and then stored in the specified index. This security measure also assists in removing bias from the recruiting process. If you want to simply search for a candidate by name, use your own database.

Can I index custom data not found on a resume?

Yes. You can index a wide variety of data that you define and then use to narrow or widen result sets.

How can I test Search & Match?

You can test bimetric scoring, the same algorithm used in matching, on our demo site. To test matching at scale, reach out to sales@textkernel.com to discuss setting up a test environment using real data.

Skills Intelligence

How does Textkernel define a skill?

Textkernel defines a skill as a trait or capacity that could be associated with an individual person in a variety of professional situations or contexts.

How many skills and professions does the Textkernel Taxonomy cover?

We distinguish more than 13,000 unique skills, encompassing over 250,000 skill synonyms spanning 20+ languages. For professions, we distinguish more than 4,500 unique professions, encompassing over 150,000 job title synonyms across 10 languages. Each unique skill and profession has a code, and those codes are consistent across languages.

What process did Textkernel use to collect, assess and classify all concepts and synonyms?

We began by taking a wide view of several sources of skill information, including CVs/resumes, job vacancies, numerous existing taxonomies, online sources such as Wikipedia, and training and development descriptions and syllabuses from various educational institutions.

We then used statistical techniques and machine learning to distill our list into the most recurring concepts and create a hierarchy of terminology. We assessed relevancy by analyzing the actual occurrence of skills across millions of job vacancies. Finally, our dedicated quality assurance team extensively inspects and improves the system to guarantee that the skill extraction reflects human intuition.

How are Textkernel's taxonomies structured?

For skills, we distinguish four categories of skills: "Professional skills", "Soft skills", "IT skills" and "Languages". For professions, we have a hierarchy of profession categories (e.g. "Healthcare"), profession groups (e.g. "Physicians") and the professions themselves (e.g. "General Practitioner"). Our Ontology connects the professions and skills.
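
As a purely illustrative data sketch of that structure (the concept names follow the examples above, while the code and the skills listed per category are hypothetical):

```python
# Illustrative only; the code below is hypothetical and the skills are example entries.
profession_hierarchy = {
    "Healthcare": {                                   # profession category
        "Physicians": [                               # profession group
            {"code": "P-0001", "profession": "General Practitioner"},   # hypothetical code
        ],
    },
}

skill_categories = {
    "Professional skills": ["Budgeting"],
    "Soft skills": ["Teamwork"],
    "IT skills": ["SQL"],
    "Languages": ["Dutch"],
}
```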

How frequently does Textkernel update the taxonomy?

Through our data-driven discovery process and the incorporation of customer feedback, we ensure that new skills, job titles and their synonyms are regularly added to the taxonomy. We generally release updates to our internal synonym lists every two weeks. Updates to the taxonomy structure, such as adding completely new skills, are grouped into a release once every three months to allow you to adopt the new version in your own data and processes.

Can I suggest new skills or job titles?

We welcome feedback on our taxonomy. Please send your suggestions to support@textkernel.com. We also consult closely with several of our key customers on the continued enhancement of our taxonomy. If you are interested in participating in our Premium Taxonomy Maintenance program, please contact us for more details.

Pricing

What is the cost to parse a resume/CV?

All Tx Platform API transactions consume credits that are purchased in advance. For more information on transaction costs, refer to our documentation.

What happens if I use more credits than I have purchased?

Textkernel provides several ways to prevent you from unintentionally running out of credits.

The best way to ensure that you always have sufficient credits is to purchase a monthly subscription. If you use all your credits before the month ends, a new purchase will be automatically initiated, with no intervention needed from you.

There are other safeguards built in:

  • In every API response, we provide how many credits are remaining. You should develop an alert to notify someone when your credits reach a certain threshold (a sketch follows this list).
  • Textkernel will notify you when your remaining credits drop to 15% of your annual usage. (This notification is only sent out once per day.)
  • Textkernel will notify you when your credits reach 0. (This notification is only sent out once per day.)
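
Below is a minimal sketch of such a threshold alert. It assumes the CreditsRemaining value mentioned later in this FAQ is available in each API response; the property path and the notification hook are placeholders for your own implementation.

```python
CREDIT_ALERT_THRESHOLD = 5_000   # choose a threshold that suits your annual volume

def notify_ops(message: str) -> None:
    # Placeholder: wire this up to email, Slack, or whatever alerting you already use.
    print(message)

def check_credits(api_response: dict) -> None:
    """Alert when the credits reported in an API response dip below the threshold."""
    remaining = api_response.get("Info", {}).get("CreditsRemaining")   # property path assumed
    if remaining is not None and remaining < CREDIT_ALERT_THRESHOLD:
        notify_ops(f"Tx Platform credits low: {remaining} remaining")
```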

Account Management

How do I know how many credits I have available in my SaaS account?

You can check your remaining SaaS credit balance by looking at the CreditsRemaining field returned with each API transaction. Also, you can track usage for your account in the Tx Console.

Will creating a sub-account change the main account?

No, you will not see any changes to the existing account, indexes, or documents.

Will each sub-account have its own AccountId and ServiceKey?

Each sub-account will have its own AccountId and ServiceKey. Because you are providing the geocoding credentials, it is up to you whether to use the same credentials for every sub-account or set up different ones with your provider.

Can the parent account view sub-account Search & Match data (or vice-versa)?

No. All accounts should be treated as independent accounts that share credits. They cannot view each other's indexes or documents.

Is the Maximum Allowable Number of Concurrent Batch Transactions in your Acceptable Use Policy applied per sub-account, or does it apply to the traffic across all sub-accounts combined?

That limit applies across all sub-accounts combined.

LLM Parser

When should I use Textkernel's LLM Parser over the standard parser?

Textkernel's LLM Parser demonstrates the advanced capabilities of our technology. You can assess whether it's suitable for your use cases or if our standard parser remains the better choice.

Our standard parser is known for its speed and impressive accuracy, achieving over 95% accuracy for the most critical data points. Textkernel's LLM Parser elevates accuracy even further, reducing the remaining errors by up to 30%. However, it's important to note that it currently requires more time for parsing and comes at a higher cost.

Do I need to make changes to my API integration or field mappings?

If you're already using the Tx Platform, there's no need for modifications. The LLM engine utilizes the same API models and endpoints as the existing Tx Platform. You can enable Textkernel's LLM Parser with a single boolean flag on the request.
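
A hedged sketch of what that flag could look like on a parse request is below. The flag name is a placeholder, and the other request details reuse the assumed endpoint and credentials from the earlier sketches; take the real flag name from the API reference.

```python
# "UseLLMParser" is a placeholder flag name; consult the API reference for the real one.
payload = {
    "DocumentAsBase64String": doc_b64,    # base64 document, as in the earlier sketch
    "UseLLMParser": True,                 # placeholder boolean enabling the LLM Parser
}
response = requests.post(
    f"{TX_BASE_URL}/parser/resume",       # assumed endpoint, see earlier sketch
    headers={"Tx-AccountId": ACCOUNT_ID, "Tx-ServiceKey": SERVICE_KEY},
    json=payload,
    timeout=180,                          # LLM parsing takes longer than the standard parser
)
```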

Is my data safe when using Textkernel's LLM Parser?

Yes, the CV/Resume is sent to an OpenAI model and is not stored. OpenAI does not have access to this data, and it is not used to improve their models.

Does Textkernel's LLM Parser support languages other than English?

Initially, our focus was on English to showcase the capabilities of LLM technology. However, we plan to expand the LLM engine to include more languages in the future.

Will I get charged add-on transaction costs for non-English documents?

No, when you enable the LLM Parser and we detect a non-English document, we will return an error and not charge the add-on transaction cost.
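
If you send mixed-language traffic, you may want to retry with the standard parser when a document is rejected. The sketch below is an assumption about how such handling could look; in production, check the documented error code or message for the non-English case rather than retrying on every failure as this simplified version does.

```python
def parse_with_llm_fallback(payload: dict) -> dict:
    """Try the (assumed) LLM Parser flag first; fall back to the standard parser on error."""
    llm_payload = dict(payload, UseLLMParser=True)       # placeholder flag name, see earlier sketch
    response = requests.post(
        f"{TX_BASE_URL}/parser/resume",
        headers={"Tx-AccountId": ACCOUNT_ID, "Tx-ServiceKey": SERVICE_KEY},
        json=llm_payload,
        timeout=180,
    )
    if response.ok:
        return response.json()

    # Simplified fallback: inspect the documented error for the non-English case before retrying.
    response = requests.post(
        f"{TX_BASE_URL}/parser/resume",
        headers={"Tx-AccountId": ACCOUNT_ID, "Tx-ServiceKey": SERVICE_KEY},
        json=payload,
        timeout=90,
    )
    response.raise_for_status()
    return response.json()
```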

Does it work with Search & Match?

If you're using the Tx Platform, our LLM Parser will produce the same data model that Textkernel Search & Match uses, making it compatible. Please reach out to support@textkernel.com for tailored advice regarding your setup.

Why is Textkernel's LLM Parser better than building a parser using OpenAI directly?

By combining Textkernel technology with ChatGPT, our solution is up to four times faster than using GPT-3.5 as a standalone parser. We manage the entire parsing process, including document conversion to text and HTML, as well as handling column layouts. Moreover, we enhance the output with our data-driven skills and profession taxonomies. By choosing Textkernel's LLM Parser, you eliminate the need to spend time selecting, evaluating and tuning LLM technologies, and you ensure that you benefit from future enhancements and features.

Why do you use ChatGPT and not some other Large Language Model?

At this stage, GPT-3.5 is the most viable option. However, once you've integrated with our API, you will automatically benefit from any future changes or improvements in the underlying LLM technology.

How do you deal with hallucinations?

We employ several strategies in our prompts to minimize the impact of hallucinations. In our evaluations of the parsing accuracy, hallucination issues have rarely been observed. Overall, Textkernel's LLM Parser offers improved accuracy compared to the standard parser, helping to offset any potential hallucinations.