Number plate recognition – or image recognition – is typically driven by technology known as deep learning. From optical character recognition (OCR) in your scanner, to facial recognition on your smartphone, it has become one of the mainstays of IT and Artificial Intelligence.
While many ‘big tech’ companies are making deep learning much more accessible through facial recognition APIs or general image recognition, in this post we focus on vehicle number plate recognition with no coding. We cover the whole path, from broadcasting from a smartphone camera to uploading data from a recognised plate into a cloud database.
Prerequisite technology overview for number plate recognition
There are four main tools that we are using for this example:
- OpenALPR offers automatic number plate recognition. Its cloud version identifies the number plates in your video stream
- Talend is used to retrieve the vehicle’s information from OpenALPR, and store it into a database
- BroadcastMe app (available from the Apple App Store or Google Play) allows you to broadcast live video from your smartphone through a content delivery network (CDN)
- Snowflake to host a cloud database
The role of OpenALPR
OpenALPR is an open source number plate recognition framework built on top of the OpenCV library. OpenALPR offers a range of free options to try. For this scenario, however, we will focus on their Cloud Stream product, which enables you to easily retrieve the identified number plates and provides a control platform for the various camera feeds or broadcasts.
The library goes even further by providing the make, model and colour of the car attached to the recognised number plate, plus additional technical detail relating to the video stream.
This platform will automatically save the broadcast and retain an image of the car. It will also let you search for a specific number plate or a car model and colour. The data is accessible via an API.
Talend is a software integration vendor. It specialises in big data, cloud storage, data integration, data management, master data management, data quality, data preparation and enterprise application integration software and services. Analytics8 is a Talend gold partner, which means our consultants are experts in utilising Talend technology.
For this plate recognition scenario, only Talend Open Source is required: we can leverage Talend to retrieve all information from the Cloud ALPR API, as well as de-dupe and cleanse the data, before storing the information in a Snowflake database.
BroadcastMe is a video broadcasting application for iOS and Android from Streamaxia enabling Real-Time Messaging Protocol (RTMP) and CDN broadcasting.
For this scenario, the application will be installed on your smartphone and broadcast to a CDN. The CDN will relay the video to the OpenALPR agent for number plate identification. A CDN also makes it easy to stream the video to multiple locations.
Snowflake is a cloud database and data warehouse (DWH) provider. The company has several offerings, including a free trial option. For this scenario, Snowflake will be used to store the vehicle information identified by ALPR. It will minimise the need for configuration while providing an easy-to-use DWH/data lake. Talend will also retrieve the data from Snowflake to identify whether the car has been seen previously.
Number recognition from your smartphone: Step-by-step guide
1. Create an OpenALPR account
Go to https://cloud.openalpr.com/account/register. The first two weeks are free of charge and allow you to test all functionalities (like the cloud API that will be used in this test).
2. Install the agent
Once your account is created, you need to install an agent on a server that can access a video broadcast (either a webcam on the same local network or a live stream broadcast over the internet). The installation is described here, and only requires picking the type of Web Stream service. To stay codeless, choose OpenALPR Cloud and supply the email and password used at registration.
For this scenario, the agent and Talend are both installed on a single Microsoft Azure VM.
The agent is a service that controls the OpenALPR library and feeds a queue with recognised number plates. The controller transforms a video stream into a sampled image to be scanned by the API. If something is found, the image is stored and sent to the web agent.
With additional work, it is possible to retrieve the identified plates directly from the agent’s beanstalkd queue. This is not described here, as the whole idea is to go codeless and doing so requires either custom code or a licensed Talend ESB product. If you want to explore OpenALPR further, this documentation can help you.
3. Test the agent
It is recommended to test the agent to make sure it can communicate with the web application.
If you have installed the services in the right order, the web dashboard should pick up your agent. You can then manually configure it by going to Configuration > Agents.
This page lets you select the country, to get the most out of the deep learning algorithm, and shows you the state of each related camera.
Before setting up a broadcast from your smartphone, it is recommended to test the agent by feeding it a looping video that shows number plates.
If it is working, you can see a green thumb next to the stream:
4. Set up Snowflake
The first step is to create a free trial account via snowflake.net.
For this test, you only need the standard edition. Pick your region (the website recognises your current location and will suggest the Deployment Region closest to your current Internet-facing server):
The next screen will ask you to fill out your credit card details to verify that you are over 18 and a legitimate customer. It will take one day to verify your account, after which you can use it for free for one month (or up to a certain amount, depending on your compute/storage usage).
Create underlying table
Once you have received your credentials, log in to the Snowflake admin page to create your first table (this assumes that you already have a TEST_DB database created).
To do so, click on WORKSHEET, copy and paste this script, and execute all of it.
USE TEST_DB;

CREATE TABLE IF NOT EXISTS PlateRecognition (
    idplaterecognition integer NOT NULL AUTOINCREMENT,
    plate varchar(10) NOT NULL,
    make varchar(255) DEFAULT NULL,
    color varchar(100) DEFAULT NULL,
    cameraName varchar(100) NOT NULL,
    best_uuid varchar(255) NOT NULL,
    scan_start_time varchar(45) DEFAULT NULL,
    scan_end_time varchar(45) DEFAULT NULL,
    time_insert datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (idplaterecognition)
)
CLUSTER BY (plate, best_uuid, cameraName)
COMMENT = 'Plate Recognition Test';
You can see a good execution in this image:
Verify the table
After clicking on DATABASES you can see the table in the list and the structure:
5. Set up Talend
With the test stream running, the agent will start putting number plates into the queue. This is where Talend comes in: it can call the cloud API, extract information from the JSON response and output it to a database.
This article will not describe how to install Talend Open Studio; however, you can find further information here on the official Talend website.
With Talend Studio installed locally on your machine, launch the studio. It will prompt you to create a new project:
- Click on “create a new project”
- Type Plate_Recognition in the box and click Create
- The project will then appear in the project list. Select the project and click Finish
- Studio will open with a welcome page. Close it.
- Create a new job by right clicking on Job Designs / Standard in the metadata panel (left by default)
- Choose a meaningful name, purpose and description (example below) and press Finish
- Retrieve plate from ALPR CLOUD API to Snowflake DB
- Connect to Cloud API and retrieve the pull of recognised plate
Talend Job variables
The first step after creating the job is to create a series of variables. In Talend, these are called context variables.
These are our two preferred options:
- In-job context variable. Mainly used for settings that concern a single job, such as debugging.
- Go into the Context tab of the job
- Click on the green plus to create a new line
- Choose the type of variable from the drop-down. For this exercise, you can choose String
- The list of variables is:
- PubKey – found on the Cloud Api page
- Reusable context group. A good option for variables shared across jobs and across multiple environments.
- Right click on Contexts from the repository list and choose Create a context group
- Enter a meaningful name, purpose and description in the new window
- The rest of the steps are similar to steps 1b, c and d
- Click Finish once you have created your 3 variables
- To import your group context variables, go into the Context tab of the job and click on the importer button
- Then select your context group from the list and click OK to import your variables into your job
It is faster to manage the Snowflake connection through the metadata panel. Talend provides a tutorial for this as well, and the recommendation is to create a context group out of it.
Talend Job component
With the variables ready, it is time to create the backbone job. Please watch these two tutorial videos on Talend’s YouTube channel; they will help you select your components:
The job retrieves the list of recognised number plates from the OpenALPR API. The JSON list is parsed and deduped. Talend can perform other processing as well, such as marking whether a plate is new or previously seen, registering the record in the database and so on.
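For readers who prefer to see the logic spelled out, the dedupe-and-flag step can be sketched in a few lines of Python. This is only an illustration of the idea, not what Talend generates: the function name dedupe_and_flag, the is_new flag and the sample records are all hypothetical.

```python
from typing import Iterable


def dedupe_and_flag(records: Iterable[dict], seen_plates: set) -> list:
    """Drop repeated plates within a batch and flag each survivor as new or already seen."""
    unique = {}
    for record in records:
        # Keep only the first record for each plate in this batch.
        unique.setdefault(record["plate"], record)

    flagged = []
    for plate, record in unique.items():
        flagged.append({**record, "is_new": plate not in seen_plates})
        seen_plates.add(plate)  # remember the plate for the next batch
    return flagged


# Hypothetical batch: the same plate is reported twice by the API.
batch = [
    {"plate": "ABC123", "best_uuid": "uuid-001"},
    {"plate": "ABC123", "best_uuid": "uuid-002"},
    {"plate": "XYZ789", "best_uuid": "uuid-003"},
]
known_plates = {"XYZ789"}  # plates already stored in the database
for row in dedupe_and_flag(batch, known_plates):
    print(row["plate"], "new" if row["is_new"] else "seen before")
```

The set of known plates plays the role of the in-memory cache: it is read to decide whether a plate is new and updated as plates are processed.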
Connection to Snowflake
To minimise calls to the cloud database, you can cache Snowflake data in a tHashOutput as below. Drag and drop your Snowflake metadata to the canvas and choose tSnowflakeConnection. Repeat this step to drag and drop a tSnowflakeInput and connect it to a tLog.
The first Input is a hack to start the Warehouse on demand:
ALTER WAREHOUSE LOAD_WH RESUME
Repeat the process to retrieve the data via a tHashInput (“Plate Cached” component from Figure 1) directly from memory. The SQL query is:
SELECT DISTINCT plate, MAX(best_uuid) as best_uuid FROM PlateRecognition GROUP BY plate
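To make the shape of the cached data concrete, here is a minimal Python sketch of what that query returns: one entry per plate, holding the highest best_uuid (compared as strings). The function name build_plate_cache and the sample rows are hypothetical and only serve the illustration.

```python
def build_plate_cache(rows):
    """Return {plate: highest best_uuid}, mirroring MAX(best_uuid) ... GROUP BY plate."""
    cache = {}
    for plate, best_uuid in rows:
        current = cache.get(plate)
        # Keep the largest value seen so far for this plate.
        if current is None or best_uuid > current:
            cache[plate] = best_uuid
    return cache


# Hypothetical rows as (plate, best_uuid) pairs from the PlateRecognition table.
rows = [
    ("ABC123", "uuid-001"),
    ("ABC123", "uuid-007"),
    ("XYZ789", "uuid-003"),
]
plate_cache = build_plate_cache(rows)
print(plate_cache)
```

Checking whether a plate has been seen before then becomes a simple in-memory lookup, which is the purpose of the cache.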
Once you have set up your context variables in Talend, you can easily configure your GET request, formatted as context.host+"company_id="+context.CompanyID+"&start="+globalMap.get("dateTime"):
The request contains three parts:
- Host: since we chose the free option, context.host is an alias of https://cloud.openalpr.com/api/search/group?
- Company ID is your unique ID as described in the setup, to ensure only you can access the JSON result
- Start time enables us to retrieve only the last X plates from the queue. It works with an infinite loop that runs every 1000 milliseconds (every second), and dateTime contains a date string in the format yyyy-MM-dd'T'HH:mm:ss.000'Z'
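As an illustration, the same request URL can be assembled in Python. This is a sketch under a few assumptions: the timestamp is taken to be a 24-hour UTC time, and the function names and the company ID value are hypothetical. Only the host and the company_id and start parameters come from the steps above.

```python
from datetime import datetime, timezone

# Host alias described above.
HOST = "https://cloud.openalpr.com/api/search/group?"


def format_start(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp for the start parameter."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")


def build_search_url(host: str, company_id: str, start: datetime) -> str:
    """Concatenate the three parts of the request, as the Talend expression does."""
    return host + "company_id=" + company_id + "&start=" + format_start(start)


# Hypothetical company ID and start time.
start_time = datetime(2019, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
print(build_search_url(HOST, "my-company-id", start_time))
```

Nothing is sent over the network here; the sketch only shows how the host, company ID and start time combine into one URL.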
Parsing the JSON
The first stage defines the root and loop JSONPath, "$..fields.[*]", and then performs the actual mapping. The JSON stream contains more information than the fields mapped below.
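A rough Python equivalent of this parsing step is sketched below. The sample payload is hypothetical and is not the real API schema: it only assumes that each result carries a "fields" object, as the JSONPath suggests, and borrows a few column names from the PlateRecognition table. The function names are also invented for the example.

```python
import json

# Hypothetical response, shaped only to match the "fields" key used in the JSONPath.
SAMPLE_RESPONSE = """
[
  {"pk": 101, "fields": {"plate": "ABC123", "make": "toyota", "color": "blue", "best_uuid": "uuid-001"}},
  {"pk": 102, "fields": {"plate": "XYZ789", "make": "ford", "color": "red", "best_uuid": "uuid-002"}}
]
"""

# Columns to keep; anything else in the payload is ignored.
WANTED = ("plate", "make", "color", "best_uuid")


def iter_fields(node):
    """Yield every object stored under a "fields" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "fields" and isinstance(value, dict):
                yield value
            else:
                yield from iter_fields(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_fields(item)


def extract_rows(text: str) -> list:
    """Parse the JSON text and map each "fields" object to the wanted columns."""
    return [
        {column: fields.get(column) for column in WANTED}
        for fields in iter_fields(json.loads(text))
    ]


for row in extract_rows(SAMPLE_RESPONSE):
    print(row)
```

Columns missing from a record come back as None, which loosely matches the nullable columns in the table definition.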
Store the number plate
The tMap component gives you the opportunity to have multiple outputs. In this example, there are three:
- tLog to print the plate on the screen
- tSnowflakeOutput (drag and drop from metadata) to store data into Snowflake
- tHashOutput to update the plate cache with the newly discovered number plates
The design of the job allows continuous plate integration. So, you can start the job by clicking on the Run tab.
You will see the flow going from the API call to the various links.
We also recommend adding a tPostJob at the end of the flow to stop the Snowflake warehouse automatically. The SQL is “ALTER WAREHOUSE LOAD_WH SUSPEND”.
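The overall control flow (poll every second, store what comes back, and always suspend the warehouse at the end) can be sketched as follows. This is an illustration of the pattern rather than a description of Talend internals: the function run_polling_loop and its fetch, store and suspend callables are hypothetical stand-ins for the API call, the Snowflake output and the tPostJob step.

```python
import time


def run_polling_loop(fetch, store, suspend, interval_seconds=1.0,
                     max_iterations=None, sleep=time.sleep):
    """Poll fetch() repeatedly, pass each record to store(), and always call suspend()."""
    iterations = 0
    try:
        # max_iterations=None means loop forever, like the job described above.
        while max_iterations is None or iterations < max_iterations:
            for record in fetch():
                store(record)
            iterations += 1
            sleep(interval_seconds)
    finally:
        # Runs even if fetch() or store() raises, like a post-job step.
        suspend()
    return iterations


# Hypothetical stand-ins so the sketch runs without any external service.
stored = []
count = run_polling_loop(
    fetch=lambda: [{"plate": "ABC123"}],
    store=stored.append,
    suspend=lambda: print("warehouse suspended"),
    max_iterations=2,
    sleep=lambda seconds: None,  # skip the real wait in this demo
)
print(count, len(stored))
```

Putting the suspend call in a finally block means the warehouse is stopped whether the loop ends normally or fails midway.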
In a corporate environment, your video acquisition stream will typically come from within the boundaries of your network (or VPN). OpenALPR supports a direct stream from an Axis camera or an MP4 video.
This case study is based on using your smartphone, which comes with a limitation: your mobile is on your carrier’s network, meaning it cannot reach the OpenALPR agent directly (unless you are willing to root it). After trying a few mobile applications, BroadcastMe proved the easiest: it comes with a free CDN (Content Delivery Network) over RTMP and is available for both Android and iOS. The CDN is public facing, meaning you can process the stream with the plate recognition engine and run a security broadcast at the same time.
There is a good tutorial on Vimeo here on how to set up the CDN. The key part is to retrieve the RTMP link, which looks like http://rtmp.streamaxia.com:1935/streamaxia/XXXXXXXXXXXXX/playlist.m3u8, and then set up the feed on the OpenALPR agent.
Nowadays, many free technologies are available to build a proof of concept (POC). Number plate recognition and deep learning are easily accessible. The integration part can be complicated, but leveraging both Talend and Snowflake makes it straightforward, while BroadcastMe makes overcoming network trouble effortless.
Cookbook prepared by Adrien Follin – Talend practice lead. Adrien has over a decade of experience in IT with Data and Integration and general Business Intelligence implementation. He is passionate about simplifying the impossible and bringing significant business value with his data and analytics solutions.
Analytics8 is a technology-agnostic advisor and implementation partner, not a software reseller. We pride ourselves on being able to provide businesses with genuine, unbiased advice based on our deep experience in data warehousing, business intelligence and advanced analytics. We are passionate about transforming data into usable information and actionable insight and have the skills and experience to deliver projects irrespective of an organisation’s current technology mix.