SK Train Ticket

Android application for storing and displaying Slovak Rail tickets.

Where?

Available on Play Store.

Used Technologies

  • Java
  • Kotlin

Implementation Details and Challenges

Data from PDF

Tickets can be downloaded in PDF format. When such a ticket arrives in my application, I need to extract some data from it. In addition to the QR code, I also needed some text from the ticket (e.g. for displaying the ticket list, sorting, and resolving duplicates). At first I considered using a third-party library for reading PDF files. I ended up with a custom solution which, for each page in the PDF file, extracts a list of objects consisting of:

  • position on page (X and Y)
  • text fragment

Looking at the tickets, I knew approximately where each piece of information is positioned, and also where some constant texts (labels) are positioned. The positions of the text fragments relative to these constant fragments were the main clue for determining what each text fragment means:

  • the From station
  • the To station
  • the validity begin date and time
  • the validity end date and time
  • the ticket holder’s name
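
A minimal sketch of how such a label-relative lookup could work. The class names, the label texts and the offsets are hypothetical and chosen only for illustration; the real ticket layout and the app's actual code are not shown here.

```java
import java.util.List;

/** One piece of text extracted from a PDF page, with its position on the page. */
final class TextFragment {
    final double x;
    final double y;
    final String text;

    TextFragment(double x, double y, String text) {
        this.x = x;
        this.y = y;
        this.text = text;
    }
}

final class LabelRelativeLookup {
    /**
     * Returns the text of the fragment closest to the point
     * (label.x + dx, label.y + dy), or null if the label is not on the page
     * or no other fragment lies within maxDistance of that point.
     */
    static String findNear(List<TextFragment> page, String labelText,
                           double dx, double dy, double maxDistance) {
        // Find the constant label first.
        TextFragment label = null;
        for (TextFragment f : page) {
            if (f.text.equals(labelText)) {
                label = f;
                break;
            }
        }
        if (label == null) {
            return null;
        }

        // Pick the fragment nearest to the expected position of the value.
        double targetX = label.x + dx;
        double targetY = label.y + dy;
        TextFragment best = null;
        double bestDistance = maxDistance;
        for (TextFragment f : page) {
            if (f == label) {
                continue;
            }
            double distance = Math.hypot(f.x - targetX, f.y - targetY);
            if (distance <= bestDistance) {
                best = f;
                bestDistance = distance;
            }
        }
        return best == null ? null : best.text;
    }
}
```

For example, if the value is expected roughly 15 units below a label, a call such as findNear(page, "From", 0, -15, 10) returns the station name printed there, tolerating small shifts in the layout.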

Extracting the texts involved a little research and a bit of coding:

  • the texts in a PDF are stored in compressed blobs (deflate, the same compression gzip uses)
  • after extracting the text fragments, I hoped they would be encoded using some common encoding (e.g. UTF-8), but they were not; in another compressed blob I found the encoding table which had to be applied
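
The two steps can be sketched with the standard java.util.zip.Inflater. This is only an illustration under assumptions: it presumes the blob is a zlib/deflate stream, that each character is stored as a single byte code, and that the encoding table has already been parsed into a map. The class and method names are hypothetical; locating the blobs inside the PDF and parsing the table are left out.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

final class PdfStreamText {
    /** Inflates a zlib/deflate-compressed blob into its raw bytes. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                // No progress and no more input means the data is incomplete.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Truncated or unsupported stream");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not a valid deflate stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /**
     * Translates font-specific character codes to text using the encoding
     * table. Codes missing from the table are rendered as '?'.
     */
    static String decode(byte[] codes, Map<Integer, Character> table) {
        StringBuilder sb = new StringBuilder(codes.length);
        for (byte b : codes) {
            Character ch = table.get(b & 0xFF); // treat the byte as unsigned
            sb.append(ch != null ? ch : '?');
        }
        return sb.toString();
    }
}
```

In the PDF format such a table usually comes from the font's encoding or its ToUnicode map; which of the two applied to these tickets is not stated here.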

On the other hand, extracting the QR code (a JPG embedded in the PDF) was easy. So easy that there is nothing to tell about it.

Data from PNG

As with PDF, a ticket could also come in PNG format, and I had to process it. The QR code had a constant position in the image, so that part was easy.

What about the texts? Instead of using a third-party OCR library, I wrote my own simple OCR (the solution is so simple I am not even sure it can be called OCR).

The first step was to discover which font is used to draw the texts on the ticket PNG images. It was an easy guess: it was a sans-serif font, so I tried Arial, and that was it.

Knowing the font name was not enough: when letters are drawn to an image, different rendering options can be applied (anti-aliasing, kerning, maybe more).

I did not even know which library and programming language were used to create the tickets. It turned out to be Java. Interestingly, even different Java implementations (Java vs. OpenJDK at that time) render (calculate) the pixels at the edges of the letters differently.

With enough information, I wrote a simple Java program which draws the letters using the same font, size, kerning options and rendering options. I ran it under the correct Java implementation and obtained samples of the letters rendered exactly the same way as on the ticket.

Now I only had to compare the pixels exactly with the samples to find out which letter it is. Well, not quite exactly. When letters were too close together, they sometimes shared pixels because of anti-aliasing (e.g. a pixel belonged to the letter currently being resolved, but its color was a little darker because the adjacent letter contributed to that pixel, too). So the comparison had to be more forgiving when it encountered such differences.
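
One way such a forgiving comparison could look is sketched below. It is an assumption-based illustration, not the app's actual rule: pixels are grayscale values from 0 (black) to 255 (white), and only pixels in the outermost columns of a glyph are allowed to be darker than the sample, since a neighbouring letter can only add ink there. The names GlyphMatcher, matches and edgeColumns are hypothetical.

```java
final class GlyphMatcher {
    /**
     * Checks whether the glyph sample matches the image at (left, top).
     * Pixels must be equal, except that within edgeColumns columns of the
     * glyph's left or right border a pixel may be darker than the sample.
     * Both arrays are indexed [row][column] and hold values 0..255.
     */
    static boolean matches(int[][] image, int left, int top,
                           int[][] glyph, int edgeColumns) {
        int height = glyph.length;
        int width = glyph[0].length;
        if (left < 0 || top < 0
                || top + height > image.length
                || left + width > image[0].length) {
            return false; // the glyph would not fit at this position
        }
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int expected = glyph[y][x];
                int actual = image[top + y][left + x];
                if (actual == expected) {
                    continue;
                }
                boolean nearEdge = x < edgeColumns || x >= width - edgeColumns;
                // An adjacent letter can only darken a shared pixel.
                if (nearEdge && actual < expected) {
                    continue;
                }
                return false;
            }
        }
        return true;
    }
}
```

With edgeColumns set to 0 the check degenerates to the exact comparison, which makes it easy to see how much tolerance is actually needed.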

Even such a simple mechanism turned out to be a little slow. I thought about re-implementing it in C/C++ and including it as a native library. In the end, a few optimizations in the Java code were enough and the speed was sufficient. The main optimization was reusing already allocated byte buffers (even if larger than needed) throughout the process instead of allocating a new byte buffer (of the exact size) every time one was needed.
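
The buffer reuse can be illustrated with a small helper like the one below. It is a sketch with a hypothetical class name, assuming single-threaded use and callers that track the number of valid bytes themselves, because the returned array may be larger than requested and may still contain old data.

```java
final class ReusableBuffer {
    private byte[] data = new byte[0];

    /**
     * Returns an array with at least minSize bytes. The same array is handed
     * out again as long as it is big enough; a new one is allocated only when
     * the request exceeds the current capacity.
     */
    byte[] atLeast(int minSize) {
        if (data.length < minSize) {
            // Grow with some headroom so slightly larger requests reuse it too.
            data = new byte[Math.max(minSize, data.length * 2)];
        }
        return data;
    }
}
```

The trade-off is a little wasted memory in exchange for far fewer allocations and less garbage collector work in a loop that runs once per candidate letter position.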

Downloading from Web

The application can download tickets directly from the Slovak Rail website. This functionality has been completely re-implemented several times:

  • Apache HTTP client. I “browsed” the website by issuing the correct requests, including correct handling of the login form and cookies.
  • HttpURLConnection. Apache HTTP client was being deprecated and was later removed from the default Android APIs.
  • Hidden WebView browsing. It was increasingly difficult to keep up with changes on the website. Clicking in a WebView is more general and less likely to break because of some simple change in the website's communication. It even runs JavaScript automatically if it is needed somewhere in the browsing flow, and of course handles cookies and redirects automatically.
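
To give an idea of what handling cookies by hand involves in the first two approaches, here is a small sketch built on the standard java.net.CookieManager. The class name SessionCookies and the example.com URLs are placeholders, not the real website or the app's code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class SessionCookies {
    private final CookieManager manager = new CookieManager();

    /** Stores the cookies found in the response headers of a request to url. */
    void remember(String url, Map<String, List<String>> responseHeaders) {
        try {
            manager.put(URI.create(url), responseHeaders);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the value for the Cookie request header, or null if none applies. */
    String cookieHeaderFor(String url) {
        try {
            Map<String, List<String>> headers = manager.get(
                    URI.create(url), Collections.<String, List<String>>emptyMap());
            List<String> cookies = headers.get("Cookie");
            if (cookies == null || cookies.isEmpty()) {
                return null;
            }
            return String.join("; ", cookies);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With HttpURLConnection, the response headers would come from getHeaderFields() after the login request, and the returned value would be passed to setRequestProperty("Cookie", ...) on the next request. A WebView does all of this on its own, which is part of why the last approach proved more robust.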

Downloading from E-Mail

After purchase, the tickets are sent to an e-mail address, so I added an option to download them from the e-mail inbox, too. I used the javamail-android library (I point to an old URL which I used at that time).

All the requirements (POP3, IMAP, SSL support, no unnecessary downloading of already obtained messages) and the weird behavior of some mailboxes (e.g. automatic moving of downloaded messages from the inbox to the archive) resulted in a quite complex UI with several options and warnings:

Mail settings

Compatibility with Very, Very Old Android Versions

Some time ago (from the Git history I see it was around March 2017) I thought it was somehow important to support very old Android versions, too.

I ended up with an application which has:

  • minSdkVersion = 4 (yes, this is Android 1.6)
  • targetSdkVersion = 24 (Android 7)

The app probably really worked on that old Android version (I tried it on the emulator). At the same time, the app included and supported the latest API 24 features and requirements (mainly exposing files using FileProvider).

Later, it became difficult (or even impossible) to combine minSdkVersion, targetSdkVersion and the versions of the Android support libraries to achieve such a range of supported versions. And I finally understood that it was not worth it.

Application Usage Statistics

Although it is an unofficial application usable only in Slovakia, quite a lot of people found and installed it.

The application was published around April 2014.

According to statistics on Google Play, the “Active devices” count peaked around October 2018 at around 820 active devices.

The average rating as of 2019-11-29 is 4.4 (83 ratings).