Formula 1 data–staying up to date

Posted by

Just an addenda to what I posted last week about downloading the Formula 1 motor racing data where I provided a script that would

  • download the ANSI SQL compressed script
  • decompress it and convert the ANSI SQL population script to an Oracle version
  • drop the existing tables in your schema
  • build and populate new versions of tables

This is all well and good for exploring but what if you have built some research or educational apps on top of the data and you’d rather not blow the data away every time you wanted to update the data to show the current results.

There is a full set of REST APIs available on the site to achieve this, but to achieve a generic “What has changed?” (eg drivers, circuit details, constructors etc et) I figured I would end up smashing the site with lots of REST calls and the last thing anyone wants to be is that person that is killing a site due to load.

So I’ve a taken a different approach to refresh the data without dropping/recreating. In my github repos not you’ll see a few new scripts

  • f1_delta.sh
  • f1_delta.sed
  • f1_delta_wrapper.sql

Like the original script, I’ll download the full ANSI population script, but this time populate a series of tables prefixed with “F1DELT” (for delta). Then using a little SQL magic, we interrogate the data dictionary to dynamically generate a series of PL/SQL anonymous blocks which contain MERGE statements to refresh the pre-existing F1-prefixed tables with any new data in the F1DELT-prefixed tables. I deliberately generate two MERGE statements to separate the UPDATE phase from the INSERT phase, mainly so that its easier to digest changes from new data. For example, for the race results, the anonymous block that is generated looks like this


declare
  l_rows                          pls_integer;
begin
  merge into F1_RESULTS w
  using ( select
             RESULTID
            ,RACEID
            ,DRIVERID
            ,CONSTRUCTORID
            ,ORDINAL
            ,GRID
            ,POSITION
            ,POSITIONTEXT
            ,POSITIONORDER
            ,POINTS
            ,LAPS
            ,TIME
            ,MILLISECONDS
            ,FASTESTLAP
            ,RANK
            ,FASTESTLAPTIME
            ,FASTESTLAPSPEED
            ,STATUSID
          from F1DELT_RESULTS ) p
   on (  p.RESULTID                       = w.RESULTID
  )
  when matched then
  update set
      w.RACEID                         = p.RACEID
  ,   w.DRIVERID                       = p.DRIVERID
  ,   w.CONSTRUCTORID                  = p.CONSTRUCTORID
  ,   w.ORDINAL                        = p.ORDINAL
  ,   w.GRID                           = p.GRID
  ,   w.POSITION                       = p.POSITION
  ,   w.POSITIONTEXT                   = p.POSITIONTEXT
  ,   w.POSITIONORDER                  = p.POSITIONORDER
  ,   w.POINTS                         = p.POINTS
  ,   w.LAPS                           = p.LAPS
  ,   w.TIME                           = p.TIME
  ,   w.MILLISECONDS                   = p.MILLISECONDS
  ,   w.FASTESTLAP                     = p.FASTESTLAP
  ,   w.RANK                           = p.RANK
  ,   w.FASTESTLAPTIME                 = p.FASTESTLAPTIME
  ,   w.FASTESTLAPSPEED                = p.FASTESTLAPSPEED
  ,   w.STATUSID                       = p.STATUSID
  where     decode(w.RACEID                         ,p.RACEID,0,1) = 1
     or     decode(w.DRIVERID                       ,p.DRIVERID,0,1) = 1
     or     decode(w.CONSTRUCTORID                  ,p.CONSTRUCTORID,0,1) = 1
     or     decode(w.ORDINAL                        ,p.ORDINAL,0,1) = 1
     or     decode(w.GRID                           ,p.GRID,0,1) = 1
     or     decode(w.POSITION                       ,p.POSITION,0,1) = 1
     or     decode(w.POSITIONTEXT                   ,p.POSITIONTEXT,0,1) = 1
     or     decode(w.POSITIONORDER                  ,p.POSITIONORDER,0,1) = 1
     or     decode(w.POINTS                         ,p.POINTS,0,1) = 1
     or     decode(w.LAPS                           ,p.LAPS,0,1) = 1
     or     decode(w.TIME                           ,p.TIME,0,1) = 1
     or     decode(w.MILLISECONDS                   ,p.MILLISECONDS,0,1) = 1
     or     decode(w.FASTESTLAP                     ,p.FASTESTLAP,0,1) = 1
     or     decode(w.RANK                           ,p.RANK,0,1) = 1
     or     decode(w.FASTESTLAPTIME                 ,p.FASTESTLAPTIME,0,1) = 1
     or     decode(w.FASTESTLAPSPEED                ,p.FASTESTLAPSPEED,0,1) = 1
     or     decode(w.STATUSID                       ,p.STATUSID,0,1) = 1;

  l_rows := sql%rowcount;
  dbms_output.put_line('F1_RESULTS                    Merge phase 1 complete, updated rows='||l_rows);

  merge into F1_RESULTS w
  using ( select
             RESULTID
            ,RACEID
            ,DRIVERID
            ,CONSTRUCTORID
            ,ORDINAL
            ,GRID
            ,POSITION
            ,POSITIONTEXT
            ,POSITIONORDER
            ,POINTS
            ,LAPS
            ,TIME
            ,MILLISECONDS
            ,FASTESTLAP
            ,RANK
            ,FASTESTLAPTIME
            ,FASTESTLAPSPEED
            ,STATUSID
          from F1DELT_RESULTS ) p
   on (    p.RESULTID                       = w.RESULTID
  )
  when not matched then
  insert
    (   w.RESULTID
    ,   w.RACEID
    ,   w.DRIVERID
    ,   w.CONSTRUCTORID
    ,   w.ORDINAL
    ,   w.GRID
    ,   w.POSITION
    ,   w.POSITIONTEXT
    ,   w.POSITIONORDER
    ,   w.POINTS
    ,   w.LAPS
    ,   w.TIME
    ,   w.MILLISECONDS
    ,   w.FASTESTLAP
    ,   w.RANK
    ,   w.FASTESTLAPTIME
    ,   w.FASTESTLAPSPEED
    ,   w.STATUSID
  )
  values (p.RESULTID
         ,p.RACEID
         ,p.DRIVERID
         ,p.CONSTRUCTORID
         ,p.ORDINAL
         ,p.GRID
         ,p.POSITION
         ,p.POSITIONTEXT
         ,p.POSITIONORDER
         ,p.POINTS
         ,p.LAPS
         ,p.TIME
         ,p.MILLISECONDS
         ,p.FASTESTLAP
         ,p.RANK
         ,p.FASTESTLAPTIME
         ,p.FASTESTLAPSPEED
         ,p.STATUSID
  );
  l_rows := sql%rowcount;
  dbms_output.put_line('F1_RESULTS                    Merge phase 2 complete, inserted rows='||l_rows);

end;

Thus when you run f1_delta_wrapper.sql, the final phase of the script will apply these PLSQL blocks to each of the tables resulting in an output like this below


F1_CIRCUITS                   Merge phase 1 complete, updated rows=0
F1_CIRCUITS                   Merge phase 2 complete, inserted rows=0
F1_CONSTRUCTORRESULTS         Merge phase 1 complete, updated rows=0
F1_CONSTRUCTORRESULTS         Merge phase 2 complete, inserted rows=10
F1_CONSTRUCTORS               Merge phase 1 complete, updated rows=1
F1_CONSTRUCTORS               Merge phase 2 complete, inserted rows=0
F1_CONSTRUCTORSTANDINGS       Merge phase 1 complete, updated rows=0
F1_CONSTRUCTORSTANDINGS       Merge phase 2 complete, inserted rows=10
F1_DRIVERS                    Merge phase 1 complete, updated rows=0
F1_DRIVERS                    Merge phase 2 complete, inserted rows=0
F1_DRIVERSTANDINGS            Merge phase 1 complete, updated rows=0
F1_DRIVERSTANDINGS            Merge phase 2 complete, inserted rows=21
F1_LAPTIMES                   Merge phase 1 complete, updated rows=0
F1_LAPTIMES                   Merge phase 2 complete, inserted rows=1259
F1_PITSTOPS                   Merge phase 1 complete, updated rows=0
F1_PITSTOPS                   Merge phase 2 complete, inserted rows=25
F1_QUALIFYING                 Merge phase 1 complete, updated rows=0
F1_QUALIFYING                 Merge phase 2 complete, inserted rows=20
F1_RACES                      Merge phase 1 complete, updated rows=0
F1_RACES                      Merge phase 2 complete, inserted rows=0
F1_RESULTS                    Merge phase 1 complete, updated rows=0
F1_RESULTS                    Merge phase 2 complete, inserted rows=20
F1_SEASONS                    Merge phase 1 complete, updated rows=0
F1_SEASONS                    Merge phase 2 complete, inserted rows=0
F1_STATUS                     Merge phase 1 complete, updated rows=0
F1_STATUS                     Merge phase 2 complete, inserted rows=0
Changes not committed yet - don't forget to COMMIT or ROLLBACK

Now you can keep your Formula 1 racing data up to date every time new data appears on the site.

Enjoy!

Got some thoughts? Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.