I had a WRX STI for a few years. Super fun car. Then the pandemic hit and cars were difficult to find. I wasn't using mine, and my family really only had use for one car, so I sold the Subaru. A few months later we needed two cars again, and quickly. I knew I wanted an electric car, so I looked at Teslas, and Leafs, and whatnot, and finally at the Toyota Mirai, remembering that earlier in the year I had heard about some absolutely amazing deal. The deal was still happening: you could get the car out the door for $15k, and then they'd give you $15k in free fuel for it.
Obviously there was some kind of catch, and it is the fueling situation. Doing a bit of research, I learned all about the locations of the fuel stations around the Bay Area, and a little bit about their reliability. I learned about a recent H2 shortage caused by a major plant accident. Still, even after all that, I decided to jump in. The truth is I have been giddy about this car for as long as I have known about it. I used to try to spot them driving around the Bay Area, and when I found one, I might follow it until it dispensed water. They are really cool cars. Electricity from hydrogen, totally wild!
Now, a few months in, it is going about as well as I expected. I have had some close calls with fueling, letting the tank run low and really risking having to be towed. The thing I failed to fully anticipate was how much the station status reports can actually be trusted. I have learned that a station's status is not a fully automated measurement. And why would it be? There are only so many ways to detect automatically that a station is down, and many more modes of failure. This means there is a bit of room for predicting when a station will be marked as inoperable. At the very least, seeing longer term statistics helps me decide whether I should head to a station, wait another day, or go to a different, perhaps less convenient, station.
There are two sites for monitoring California's hydrogen station status: https://m.cafcp.org and https://h2-ca.com/. The former is provided by the California Fuel Cell Partnership, and is the canonical source for live station status data. The latter is a website created by an avid H2 driver, Doug Dumitru. It provides a limited-term (90-day) statistical view of regional data and seven days of history for each individual station. Some station statistics are also provided, including historical %-uptime. Doug's site has proven to be extremely helpful, and has certainly played a role in helping me make good choices when picking stations.
However, I wanted longer term trends and statistics. I also wanted to develop my own predictive algorithms to give an estimated probability that a station will still be online when I get there.
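To make that goal concrete, here is one naive baseline, offered only as a sketch and not as the prediction algorithm itself. It assumes the status history has already been reduced to evenly spaced online/offline samples, and it estimates the chance that a station which is online now is still online some number of samples later. The function name and the sampling assumption are hypothetical.

```python
from typing import Optional, Sequence


def prob_still_online(samples: Sequence[bool], steps_ahead: int) -> Optional[float]:
    """Estimate P(online at t + steps_ahead | online at t).

    `samples` is assumed to be an evenly spaced status history for one
    station (True = online). Returns None when the station was never
    online at a point where a later sample exists to compare against.
    """
    if steps_ahead < 1:
        raise ValueError("steps_ahead must be at least 1")

    online_now = 0    # times the station was online at time t
    online_later = 0  # of those, times it was still online at t + steps_ahead

    # Pair every sample with the sample taken `steps_ahead` slots later.
    for now, later in zip(samples, samples[steps_ahead:]):
        if now:
            online_now += 1
            if later:
                online_later += 1

    if online_now == 0:
        return None
    return online_later / online_now
```

With samples taken every 15 minutes, for example, `steps_ahead=4` would correspond to a one-hour drive. This ignores time of day, station type, and everything else a real model would want; it is only the simplest thing that turns a long history into a probability.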
My approach began with what I knew, so I started with Scrapy. After inspecting the CAFCP website and spending some time attempting to parse the page, I learned that the page makes a background XHR request and uses the response to update itself dynamically. That was breaking my casual attempt at scraping, so instead of scraping the page, I tried using Python to simply get the JSON being sent in the XHR. It worked!
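A minimal sketch of that kind of direct fetch, using only the standard library, might look like the following. The endpoint URL and the User-Agent string are placeholders, since the real XHR address is not given here, and the `opener` parameter is a hypothetical hook that exists only so the function can be exercised without touching the network.

```python
import json
import urllib.request

# Placeholder only: the actual XHR endpoint is not stated in this post.
STATUS_URL = "https://example.invalid/station-status.json"


def fetch_station_json(url=STATUS_URL, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch the JSON document the page would normally load in the background."""
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            "User-Agent": "h2-status-logger/0.1",  # hypothetical identifier
        },
    )
    # The response is read fully and decoded as UTF-8 before parsing.
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Skipping the HTML entirely tends to be more robust than scraping, because the JSON is the data the page itself depends on and there is no markup to break the parser.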
Data in hand, it was time to select a database. MySQL seemed like an obvious choice, but after consulting with a coworker I started looking into schema-less options. InfluxDB was something I had heard of, so I checked it out. After going in a few circles I had my NAS running an InfluxDB container. In skimming the documentation I realized the InfluxData platform had a lot to offer, including a system for ingesting JSON data. After tinkering and futzing with the JSON data pipeline, I eventually came to the conclusion that the JSON from the CAFCP source was all kinds of weird, and the poor JSON plugin for Influx was not powerful enough to parse it. Defeated, I started to resign myself to writing a parser in Python after all.
Once again in the docs, I came across something called Starlark. Starlark is to Python what IEC 61131-3 Structured Text is to C. Specifically made to parse weird stuff, or so it seems, Starlark was the answer. After an hour or so learning the basics, I had data flowing into my InfluxDB container from a Telegraf container running Starlark. This was some kind of magic.
At this point I had been exposed to Docker, Grafana, InfluxDB, Telegraf, Starlark... so many tools, and much credit to the software design choices. They all worked together and quite nicely. It really was a testament to software engineering philosophies and principles of our day. The problems we used to have in doing stuff with computers have certainly changed, with the old problems being dare I say, solved...
Anyways, now with InfluxDB filled with data, I was on to Grafana to build some dashboards. Here is the main one I made after becoming extra familiar with Grafana variables, the Flux query language, and all the overrides I ever hoped for.
With my basic single-station view more or less sketched out, I can now prepare to develop my prediction algorithms. If I were really smart, I might try to throw some machine learning at it. I had a lot of fun with this endeavor into metrics and data platforms and, as usual, learned a bunch.