Just got level2 and now I can post and share my workflows 🎉
Last month I wrapped up my time at a startup and started looking for opportunities. The best way is an internal referral for sure, but checking open positions on LinkedIn or company websites can help you apply to more positions and get more onsite/remote interviews to strengthen your interview muscle 💪
The goal is to get an email/Slack/text notification when a certain company is hiring for a certain role or at a certain location. This doesn't need to be real-time. A daily check is good enough, since there won't be that many new gigs posted every day 😂
I will share my sample workflow at the end, but you'll almost certainly have to customize it, so please read on.
Let's start with a simple case: you want to find out whether there are any new positions for Amazon or AWS in Vancouver (yes, I live here 🇨🇦).
You go to amazon.com and find their career site https://amazon.jobs/ Click the link for "Software Development" and you land on https://amazon.jobs/content/en/job-categories/software-development I would suggest that before working on the automation, you do this manually a few times to make sure it's worth the automation effort. Choose a few filters, such as Canada, BC, Vancouver, Full time, and you get 38 results. I am looking for manager roles, so I typed "Manager" in the search bar and got a bit fewer. Sometimes it's a single page, sometimes the positions span multiple pages.

But you don't have to automate the pagination or parse the DOM. If you have a little JavaScript background, you can try to figure out whether there is a data feed behind the page. Open the browser's developer tools, go to the Network tab, filter by Fetch/XHR, reload the page, and guess which request contains the job list (usually one with a response of 10KB or more). Then check the request details: is it GET or POST, what's the URL, and what's the payload? Also, if the default page size is 10, you can probably change it to 100 to get the first 100 items without pagination.
Now in n8n, start with a Schedule Trigger, followed by an HTTP Request node. Put the URL in, and make sure you choose the right method, GET or POST, matching the request you observed. If your configuration is right, test this step and you should get a valid JSON response. Most job sites don't require login for such a list API (though if you log in, you might get some saved preferences).
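If you want to sanity-check the request outside n8n first, here's a minimal Python sketch. The URL and payload here are placeholders, not the real amazon.jobs endpoint; copy the actual ones from the Network tab.

```python
import json
import urllib.request

def build_jobs_request(url, payload):
    """Build a POST request mirroring what the career site's own JS sends."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Placeholder URL and payload: replace with what you saw in the Network tab.
req = build_jobs_request(
    "https://example.com/api/jobs/search",
    {"country": "Canada", "city": "Vancouver", "result_limit": 100},
)
# response = urllib.request.urlopen(req)   # uncomment to actually send it
# jobs_json = json.loads(response.read())
```

If the site uses GET instead, the filters usually go into the query string rather than a JSON body, which you can confirm from the request details.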
The next step is to parse the job list. In the case of amazon.jobs, the HTTP Request gives you a big JSON. You need to review that JSON output (either in n8n or in a text editor) and locate where the array is. In this case, "jobs" is the top-level key for the array. Sometimes it can be nested deeply.
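If the array is buried, it can help to poke at the JSON in plain Python first. The sample response below is made up for illustration; real responses nest differently per site.

```python
# A fake, simplified response; real job-site JSON will look different.
sample = {
    "error": None,
    "hits": 38,
    "jobs": [  # top-level key on amazon.jobs; other sites nest it deeper
        {"id_icims": "2859301", "title": "Software Development Manager", "city": "Vancouver"},
        {"id_icims": "2860472", "title": "Engineering Manager", "city": "Toronto"},
    ],
}

def find_job_array(node, key="jobs"):
    """Walk a nested JSON structure until we find the list under `key`."""
    if isinstance(node, dict):
        if key in node and isinstance(node[key], list):
            return node[key]
        for value in node.values():
            found = find_job_array(value, key)
            if found is not None:
                return found
    elif isinstance(node, list):
        for item in node:
            found = find_job_array(item, key)
            if found is not None:
                return found
    return None

jobs = find_job_array(sample)
```

Once you know the path to the array, you can point the Split Out node at it directly.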
Add a Split Out node, and drag "jobs" from the schema tree into the field. Now you have a clean table with multiple rows and columns.
Add a Filter node if necessary. You usually need this: even though I put Vancouver in the data feed URL, the job list still contained non-Vancouver positions. Maybe a bug, maybe one position covers multiple cities and each city is a row. Anyway, you can add a filter like json.city equals Vancouver, turning 14 rows into 11 (3 discarded/filtered). On some other sites, such as Databricks, you may even need multiple Filter nodes if the jobs are grouped under different departments and locations.

You only want to know which NEW jobs are posted, so you need memory, meaning you'd better save the current job list somewhere. Next time your workflow fetches a fresh list, it can compare and find which jobs were not saved previously.
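The Filter node's condition is equivalent to a simple list comprehension. The rows below are invented sample data with the same shape as the amazon feed.

```python
# Invented sample rows shaped like the amazon.jobs feed.
raw_jobs = [
    {"id_icims": "2859301", "title": "Software Development Manager", "city": "Vancouver"},
    {"id_icims": "2860472", "title": "Engineering Manager", "city": "Toronto"},
    {"id_icims": "2861113", "title": "Senior Manager", "city": "Vancouver"},
]

# Keep only rows whose city matches, same as the Filter node's condition.
vancouver_jobs = [job for job in raw_jobs if job.get("city") == "Vancouver"]
```

Each extra condition on another site (department, location group, etc.) is just another predicate in that `if`.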
The easiest way is to use a Google Spreadsheet: you can view and edit the list easily. Reading/writing a Google Spreadsheet in n8n is not a trivial task, but it's still way easier compared to writing a pile of Java/Python code. You need to create a GCP project (free), enable the Sheets API, create an OAuth app, and create a client ID/secret. Also add your email to the test users list if your app is not verified. n8n provides pretty decent docs and a video about this.
In my case, I created a new spreadsheet in my personal Google Drive, called "positions", then created a sheet for each company, so there is a sheet for amazon. Create a single column called ID. Why only one? Because you don't need multiple columns in the spreadsheet; you just need to track a set of job IDs to know which ones are known and which ones are new.
Add a Google Sheets node, use a credential with your client ID/secret, then browse and locate the positions file and the amazon sheet. Read all rows.
Add a Compare Datasets node to compare the list from the HTTP Request and the list from your Google Spreadsheet. Make sure you connect those 2 nodes directly to the Compare Datasets node (I didn't at first and couldn't reason about the result). Compare the ID across the 2 lists. Yes, you need to figure out what the ID is in the JSON data; for amazon, it's id_icims. I have no idea what that means, but who cares, as long as it's unique. The Compare Datasets node creates 4 outputs. Choose the one with items that only show up in the HTTP request, not in the spreadsheet. Those are the new job openings, and those are the ones you need to send notifications for.
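Under the hood, the output you want is essentially a set difference on the ID column. In plain Python (with invented sample IDs, using the id_icims field name from the amazon example), it looks like this:

```python
# Jobs fetched this run (invented sample data).
fetched = [
    {"id_icims": "101", "title": "Software Development Manager"},
    {"id_icims": "102", "title": "Engineering Manager"},
    {"id_icims": "103", "title": "Senior Manager, Infrastructure"},
]

# IDs already saved in the spreadsheet from previous runs.
known_ids = {"101", "102"}

# Keep only jobs whose ID has never been seen before.
new_jobs = [job for job in fetched if job["id_icims"] not in known_ids]
# These are the openings to notify about, and whose IDs you
# append back to the sheet afterwards so they aren't re-sent.
```

The other three Compare Datasets outputs correspond to the remaining set operations (in both, only in the sheet, and differing values).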
What's the best way to notify yourself? I chose to send myself an email at 7am daily if there are any new openings I'm interested in. Again, create an n8n credential for Gmail with permission to send email (no read permission needed). In the Gmail node, set the To address to your own email. The Subject can be parameterized to include the company name and position, e.g. "alert: new open position for {{ $json.company_name }}: {{ $json.title }}". The message template: {{ $json.title }} <br/>
Last Update: {{ $json.posted_date }} <br/>
{{ $json.description }} <br/>
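Those n8n expressions are just string templating. The equivalent in plain Python, assuming the same field names and an invented sample job, is:

```python
# Invented sample job; field names match the template above.
job = {
    "company_name": "amazon",
    "title": "Software Development Manager",
    "posted_date": "2024-05-01",
    "description": "Lead a team of engineers building internal tools.",
}

subject = f"alert: new open position for {job['company_name']}: {job['title']}"
body = (
    f"{job['title']} <br/>\n"
    f"Last Update: {job['posted_date']} <br/>\n"
    f"{job['description']} <br/>"
)
```

The `<br/>` tags are there because Gmail renders the message as HTML.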
After the mail is sent (if the workflow finds 2 new jobs, then 2 emails will be sent, not 1), you need to save the new job IDs to the spreadsheet. Add an "Append row in sheet" Google Sheets node, and use "{{ $('Compare Datasets').item.json.id_icims }}" as the value to send.

Wow, this post is much longer than I expected. Maybe I should record a video to make things look easier. But hey, the reality is that nothing is easy. You need to figure out which jobs you are interested in, and whether you can get a data feed without login and pagination, or have to parse the HTML DOM, or even have to extract content from a screenshot.
As I said, this is my first time publicly sharing a workflow, so I copied the nodes, pasted them into a text editor, and masked a few IDs/URLs/emails. Maybe it doesn't run as-is, which is fine: by following the steps above, you can build your own workflow. This one is still relatively simple, but it could be helpful if you want to send many resumes out, or apply for certain positions as soon as they open. Good luck, and feel free to ask me anything in the comments below.