Overview
Using the Job Parser
To use the Job Parser, pass your job document to our API. The parser extracts multiple data points such as employer name, job location, job description, required skills, required degree, and start date (for a full list of fields, please refer to the Parse a Job API Reference). In addition, the Job Parser can output the original document in a variety of formats, including HTML, PDF, and RTF.
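As a rough illustration of preparing a job document for submission, the sketch below base64-encodes a file's bytes and wraps them in a JSON body. The function name `build_parse_job_request` is hypothetical, and the field names `DocumentAsBase64String` and `DocumentLastModified` are assumptions; confirm the exact request shape, endpoint, and authentication headers in the Parse a Job API Reference before relying on it.

```python
import base64
import json
from datetime import date


def build_parse_job_request(document_bytes: bytes, last_modified: date) -> str:
    """Return a JSON request body for a Parse a Job call (assumed shape)."""
    body = {
        # Assumed field names; verify against the Parse a Job API Reference.
        "DocumentAsBase64String": base64.b64encode(document_bytes).decode("ascii"),
        "DocumentLastModified": last_modified.isoformat(),
    }
    return json.dumps(body)


if __name__ == "__main__":
    # Example with an in-memory document; in practice, read the bytes from a file.
    sample = b"Senior Data Engineer, Austin TX. Requires Python and SQL."
    print(build_parse_job_request(sample, date(2024, 1, 15)))
```

The body is built separately from any HTTP call so that it can be inspected or logged before it is sent with whichever HTTP client your integration already uses.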
Job Parser Output
To see an example of the output, parse one of your job descriptions through the API. We don't provide a sample output because your settings and the source job can affect the fields returned in the API response. For a full list of fields, please refer to the Job Schema.
Danger
IMPORTANT: Your integration code should be written robustly to handle missing elements. Most elements are optional; they are output only if all of the following are true:
1) The data exists in the parsed document.
2) The parser is able to recognize the data.
3) Configuration options have enabled parsing of those elements.
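One way to handle missing elements is to read every field through a helper that returns a default instead of raising an error. The sketch below is a minimal example of that approach. The helper `get_path` is hypothetical, and the response shape and field names (`Value`, `JobData`, `JobTitles`, `MainJobTitle`, `StartDate`) are illustrative assumptions only; use the names defined in the Job Schema.

```python
from typing import Any, Optional


def get_path(data: Any, *keys: str, default: Optional[Any] = None) -> Any:
    """Walk nested dicts and return `default` if any key is absent or null."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current if current is not None else default


# Hypothetical, heavily trimmed response; real field names may differ.
response = {"Value": {"JobData": {"JobTitles": {"MainJobTitle": "Data Engineer"}}}}

# Present in the response, so the parsed value is returned.
title = get_path(response, "Value", "JobData", "JobTitles", "MainJobTitle")

# Absent from the response, so the default is returned instead of a KeyError.
start_date = get_path(response, "Value", "JobData", "StartDate", "Date",
                      default="not provided")

if __name__ == "__main__":
    print(title, "|", start_date)
```

Routing all field access through a single helper keeps the missing-element handling in one place, so a field that is absent from a given document does not break the rest of the integration.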
Optical Character Recognition
Optical Character Recognition (OCR) ensures that scanned or photographed documents are automatically detected and converted into text, so that the text can be parsed. Without OCR, such documents cannot be parsed. On average, approximately 5% of documents require OCR.
OCR does not impact the response time for non-image documents. However, for documents that do require OCR, the response time is higher due to the computational intensity involved with converting images to text. Because of this computational intensity, we also limit OCR to 10 pages and will stop processing the document after 120 seconds.
When OCR is enabled, a small additional transaction cost is incurred with each parsing transaction. Scans and full-image documents are auto-detected, and OCR is applied when necessary. OCR can be configured in the Tx Console.