Overview
This document outlines the results obtained by running the pseudoqueries specified in the ODE Metadata Bug Impact, Resolution, and Prevention document.
WYDOT's production ODE instance was updated to the new version with the metadata bug fix (release 1.0.6) on April 11, 2019 at 6:30PM Mountain Time which corresponds to a timezone-naive UTC timestamp of April 12, 2019, 12:30AM. For the purposes of this document, dateOfBugFix is considered to be the end of the day on 04/11/2019 (midnight 4/12, in UTC time, i.e. 2019-04-12T00:00:00.000Z ISO format).
Note that these timestamps are inclusive; all data on December 3rd and April 11th is analyzed.
Timeline
High-Level Information
Data Store:
Data Provider Source Analyzed: wydot
Findings
Query Number | Query Name | Pseudoquery | Result | Notes |
---|---|---|---|---|
1 | totalRecordCount | SELECT COUNT(*) WHERE AND
| ||
2 | timBroadcastRecordCount | SELECT COUNT(*) WHERE | ||
3 | goodOtherRecordCount | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload AND metadata.recordGeneratedBy != TMC | ||
4 | badBsmRecordCount | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload | ||
5 | badOtherRecordCount | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload | ||
6 | removableRecordCount | removableRecordCount = badBsmRecordCount + badOtherRecordCount | ||
7 | totalRecordCount (correctness) | totalRecordCount equal to (==) (timBroadcastRecordCount + removableRecordCount + goodOtherRecordCount) | ||
8 | earliestGeneratedAt |
| ||
9 | latestGeneratedAt |
| ||
10 | invalidRecordCount | invalidRecordCount = removableRecordCount | ||
11 | invalidS3FileCount |
| ||
12 | invalidS3Files |
| ||
13 | listOfLogFilesBefore |
| ||
14 | validRecordCount | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 2/13/2019 AND AND metadata.recordGeneratedBy != TMC AND
| ||
15 | validS3FileCount | SELECT COUNT(s3-filename) WHERE metadata.odeReceivedAt >= 2/13/2019 AND AND metadata.recordGeneratedBy != TMC AND [ | ||
16 | validS3Files | SELECT s3FileName WHERE AND metadata.recordGeneratedBy != TMC [AND metadata.serialId.bundleSize > 1 ] | ||
17 | listOfLogFilesAfter | SELECT metadata.logFilename WHERE AND metadata.recordGeneratedBy != TMC [ |
Data Cleanup Actions
The data cleanup process has three main steps: identifying the bad data, deleting it, and confirming success. The above queries have accomplished the first step of identifying the bad data. In this section we will discuss the process of safely and correctly deleting the bad data.
Step 1: Identify scope of bad data
This has been completed by the queries above and in particular the list of files generated by query 12.
Step 2: Delete the bad data
TBD
Step 3: Verify cleanup
TBD