Summary
WYDOT informed the ODE team that the Driver Alert location data was coming out as zeros. The ODE team investigated the data files provided by the WYDOT team and confirmed the inconsistency. Further investigation of the code base revealed that a bug was introduced when the metadata.serialId field was enhanced to allow more effective auditing and tracking of data in order to detect missing and duplicate records. The defect was in the logic that calculates the metadata, NOT the payload; there was no impact on the data payload at any time.
...
An update to the LogFileToAsn1CodecPublisher class, made to support easier troubleshooting of missing or duplicate records, was included in (PR #263, commit). This class takes the raw payload data and wraps it in metadata to create a distribution-ready message. In this class, the metadata object is created once and then updated for each message that passes through. The metadata object is reused rather than recreated for performance reasons: reuse reduces memory usage and improves message processing time. The update changed how this metadata was populated, and a statement that updates the metadata was missed. As a result, the metadata was not updated between messages and was instead reused with the same information, so every metadata record generated from the same data file carried duplicate location and timestamp fields.
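The failure mode described above can be illustrated with a small Python sketch. This is not the ODE code (which is not reproduced on this page): the `Metadata` class, the `wrap_records` function, the `refresh_per_record` flag and the record field names are all hypothetical, chosen only to show how a reused metadata object that is not refreshed per message stamps every record in a file with the same location and timestamp while leaving the payload untouched.

```python
from dataclasses import dataclass, asdict


@dataclass
class Metadata:
    # Hypothetical, simplified stand-in for the real metadata object.
    record_id: int = 0
    latitude: float = 0.0
    longitude: float = 0.0
    generated_at: str = ""


def wrap_records(records, refresh_per_record):
    """Wrap each raw record in metadata, reusing a single Metadata object.

    With refresh_per_record=False the location and timestamp are only set
    for the first record, which models the missed update statement.
    """
    meta = Metadata()  # created once per file and reused for every message
    messages = []
    for index, record in enumerate(records):
        meta.record_id = index
        if refresh_per_record or index == 0:
            meta.latitude = record["lat"]
            meta.longitude = record["lon"]
            meta.generated_at = record["time"]
        # asdict() copies the current field values into the outgoing message.
        messages.append({"metadata": asdict(meta), "payload": record["payload"]})
    return messages
```

Calling `wrap_records(records, refresh_per_record=False)` on a file with several records yields identical `latitude`, `longitude` and `generated_at` values in every message, while `refresh_per_record=True` yields per-record values. In both cases the `payload` entries are passed through unchanged.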
...
- Wyoming: Wyoming has removed all of the ODE-generated data for the 12/3/2018 to 2/12/2019 period from their database and has re-uploaded all of the original log files through ODE and into the data store. However, due to the bsmSource issue, all BSM data need to be removed and the log files uploaded again.
- DataHub: All affected Wyoming data with metadata.odeReceivedAt >= 12/3/2018 until dateOfBugFix AND metadata.recordGeneratedBy != TMC should be removed from the S3 bucket. Also, a notification message will be posted on DTG and Sandbox to alert users of the inconsistencies and the time frame of the impending correction.
- SDC: All affected Wyoming data with metadata.odeReceivedAt >= 12/3/2018 until dateOfBugFix AND metadata.recordGeneratedBy != TMC should be removed from the data lake / raw submissions bucket and the Data Warehouse. Also, a notification message will be posted on the SDC site to alert users of the inconsistencies and the time frame of the impending correction.
- End Users: A notification message will be posted to inform the users of the inconsistencies and the time frame of the impending correction.
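The removal criteria in the DataHub and SDC items can be expressed as a single predicate. The following is a minimal Python sketch, not the teams' actual query: the function name `is_affected`, the assumption that `odeReceivedAt` is an ISO 8601 UTC string ending in `Z`, and the choice to treat `dateOfBugFix` as an exclusive upper bound are all illustrative assumptions. `dateOfBugFix` is left as a parameter because the criteria above do not fix its value.

```python
from datetime import datetime, timezone

# Start of the affected period, taken from the criteria above (12/3/2018).
BUG_START = datetime(2018, 12, 3, tzinfo=timezone.utc)


def is_affected(record, date_of_bug_fix):
    """Return True if a record matches the removal criteria.

    Assumes record["metadata"]["odeReceivedAt"] is an ISO 8601 string such as
    "2019-01-15T08:30:00.000Z" and that date_of_bug_fix is a timezone-aware
    datetime used as an exclusive upper bound (both are assumptions).
    """
    meta = record["metadata"]
    # fromisoformat() does not accept a trailing "Z" on older Python versions.
    received = datetime.fromisoformat(meta["odeReceivedAt"].replace("Z", "+00:00"))
    in_window = BUG_START <= received < date_of_bug_fix
    return in_window and meta["recordGeneratedBy"] != "TMC"
```

A cleanup job could apply this predicate to each stored record and collect the matching keys into a report for review before anything is deleted.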
...
For SDC, the ODE team will work with the SDC team to remove invalid data, using mutually agreed methods to identify invalid data and generate a report of affected files. Again, analysis of this report, along with any additional checks required by the data store maintainer, will determine the scope of the invalid data and the resulting deletions from the data lake / raw submissions bucket and the Data Warehouse.
The following metrics will aid in assuring the integrity of the data deposited to SDC and DataHub during the bug time period.
...
...
Task | Description | Owner | Target Completion Date | Actual Completion Date |
---|---|---|---|---|
1 | ODE will incorporate a QA checklist and script into their QA process. | BAH | 3/19/2019 (End of ODE Sprint 43) | 3/19/19 |
2 | ODE will fix the missing bsmSource defect and run the QA validation checklist and scripts to verify the fix and perform regression testing. This fix will be on top of the new SDW feature implementation. | BAH | 3/19/2019 | 3/19/19 |
3 | ODE will deploy an AWS Lambda function to check for data inconsistencies as they arrive on DataHub. Such inconsistencies may include serialId serial numbers grossly out of order or repeating, timestamps repeating, required fields missing or null, etc. These Lambda functions will be shared with the community on GitHub so that the SDC and WYDOT teams can also use them for data validation. | BAH | 4/3/2019 | 4/4/19 |
4 | ODE will release the validated ODE software to start UAT testing by WYDOT. | BAH | 3/19/2019 | 3/19/19 |
5 | WYDOT will deploy jpo-ode-1.0.5 on their DEV server and perform UAT testing. The checklist and validation scripts will be available to them for test and validation. The SDC team will test in their test environment. | WYDOT | 3/26/2019 | 3/25/19 |
6 | This item is not related to the metadata bug, but for the sake of coordination this task was inserted to assess whether it would be beneficial and more efficient to perform folder restructuring at the same time as data clean-up and before data re-upload. It was ultimately decided and approved by the PO that folder restructuring adds unnecessary complexity to the clean-up process and schedule and is best deferred until after the data has been completely restored. See Excerpt from approval email: | BAH | 3/27/2019 | |
7 | All teams (ODE, Wyoming, DataHub and SDC) collaborate on an approach to remove invalid data from repositories. Meeting with WYDOT to establish UAT completion timeline. | ALL | week of 3/20/2019 | 3/25/19 |
8 | ODE, DataHub and SDC teams will verify that freshly arrived BSM, TIM and Driver Alert messages are correct and consistent. The validation checkers deployed to both systems should also verify data validity in DEV. COORDINATION MEETING TO CONFIRM ALL TEAMS ARE RECEIVING CLEAN DATA FROM WYDOT. GO/NO-GO FOR WYDOT PROD DEPLOYMENT | | 4/11/2019 | 4/11/2019 |
9 | UAT testing completed; Wyoming will promote v1.0.7 to PROD environment AFTER coordination meeting with stakeholders ref: | WYDOT | 4/11/2019 | 4/11/2019 |
10 | ODE collects counts as specified in the Useful variable declarations section. This exercise will identify the earliest and latest generatedAt time for bad data. These counts can be used in cleanup verification. | | 4/17/2019 | |
11 | Run the queries mentioned in the Query the Count of Invalid Data section above. Note: BSM and TIM data types may require separate cleanup processes, as affected BSMs can likely be identified and removed without additional analysis. TIM messages will likely need additional effort to isolate affected TIMs due to the inability to re-upload unaffected Broadcast TIMs. Since the DataHub repository only contains Broadcast TIMs, which were not affected by the bug, no action is required for DataHub TIM cleanup. Only invalid BSMs received from 12/3/2018 - dateOfBugFix will need to be removed. SDC would need to remove all TIMs received in the period 12/3/2018 - 3/12/2019. WYDOT has already removed all invalid TIMs and the original invalid BSMs. Only BSMs received from 2/13/2019 - dateOfBugFix must be removed and uploaded again. WYDOT, SDC and DataHub will have all BSM records refreshed after the bug fix is deployed to the WYDOT PROD server. WYDOT should review whether their deduplication software will remove or retain the erroneous data. | | 4/19/2019 | |
12 | Run the query mentioned in the Query a List of Invalid Data Files on S3 section above. | | 4/19/2019 | |
13 | Run the query mentioned in the Query a List of source data files resulting in invalid data section above. | | 4/19/2019 | |
14 | The results of the queries from tasks 9-12 should be aggregated into a summary report to be presented to the product owner. | | 4/19/2019 | |
15 | COORDINATION MEETING: CONFIRM ALL TEAMS' DATA REMOVAL PLANS WITH ALL STAKEHOLDERS. ODE-1212 GO/NO-GO on DATA REMOVAL | | Recovered Schedule and conducted meeting on 4/19/2019 | 4/19/2019 |
16 | Once confidence in the summary findings is gained, the Lambda function used to run the queries in tasks 9-12 should be modified to delete the S3 files that are found using the queries. Running this function will be the cleanup step. | | 4/19/2019 | |
17 | The queries in the Pseudo-queries for Validating the Clean Up section above should be run to ensure that the expected output matches the actual output. | | | |
18 | The Validation Checklist above should be iterated to ensure that the cleanup actions deleted the invalid data and did not affect the valid data. In addition to the automatic validation steps using the queries, manual inspection of the bucket should be performed as a sanity check. | | | |
19 | COORDINATION MEETING TO CONFIRM GO/NO-GO ON WYDOT RE-UPLOAD. DataHub and SDC expect the following to be re-uploaded by WYDOT: | | 4/26/2019 | |
20 | WYDOT team starts re-uploading all data files that were identified as invalid during the analysis phase and deposits them to the respective data stores. Due to inconsistencies between data stored in the WYDOT database and SDC, an investigation was initiated with the following results: | WYDOT | 4/30/2019 | 4/30/2019 |
21 | | | 5/14/2019 | 5/14/2019 |
22 | | | 5/22/2019 | |
23 | On 5/23/2019, Brandon will re-upload all data from 12/3/2018 00:00:00 UTC through 5/15/2019 00:00:00 UTC. | WYDOT | 5/23/2019 | |
24 | DataHub will post a Release Note indicating that there are and will be duplicate records in the data until further notice. When that is depends on Lear’s fixing of the firmware. (Lien, Julia [USA], Michael Middleton) | BAH | 5/23/2019 | |
25 | Hamid Musavi will update the metadata bug Confluence page with current status, understanding and action plan. Meanwhile, copying @Ariel.Gold@dot.gov so she is informed of where we are in the re-upload process. | BAH | 5/17/2019 | |
26 | COORDINATION CLOSE-OUT E-MAIL REPORTS TO CONFIRM VERIFICATION OF CLEAN UP COMPLETED | | 6/17/2019 | 5/3/2019 held meeting, but "NO-GO" - Data re-upload issue resolution ongoing with daily meet-up: |
27 | Communicate to all data users of WYDOT and SDC, and document on DataHub, that the cleanup is complete. | BAH | 5/31/2019 | |
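The kinds of checks described in task 3 (serial numbers repeating or grossly out of order, timestamps repeating, required fields missing or null) can be sketched in Python as follows. This is an illustrative sketch only, not the Lambda function the ODE team deployed: the flat metadata layout, the field names in `REQUIRED_FIELDS`, the function name `find_inconsistencies` and the `max_backstep` tolerance for "grossly out of order" are all assumptions.

```python
# Assumed, simplified metadata layout: one flat dict per record.
REQUIRED_FIELDS = ("serialNumber", "recordGeneratedAt", "recordGeneratedBy")


def find_inconsistencies(metadata_list, max_backstep=10):
    """Return (position, description) pairs for suspicious metadata records.

    max_backstep is a hypothetical tolerance: a serial number is flagged as
    out of order only if it falls more than this far below the highest
    serial number seen so far.
    """
    issues = []
    seen_serials = set()
    seen_times = set()
    highest_serial = None
    for position, meta in enumerate(metadata_list):
        # Required fields that are absent or null.
        for name in REQUIRED_FIELDS:
            if meta.get(name) is None:
                issues.append((position, "missing " + name))
        # Repeated or grossly out-of-order serial numbers.
        serial = meta.get("serialNumber")
        if serial is not None:
            if serial in seen_serials:
                issues.append((position, "repeated serialNumber"))
            elif highest_serial is not None and highest_serial - serial > max_backstep:
                issues.append((position, "serialNumber out of order"))
            seen_serials.add(serial)
            highest_serial = serial if highest_serial is None else max(highest_serial, serial)
        # Repeated generation timestamps.
        stamp = meta.get("recordGeneratedAt")
        if stamp is not None:
            if stamp in seen_times:
                issues.append((position, "repeated recordGeneratedAt"))
            seen_times.add(stamp)
    return issues
```

A check like this is a heuristic: a repeated timestamp is a signal worth reviewing rather than proof of an error, so flagged records would still need to be examined before any removal decision.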
...
This is an update regarding the errors in the Wyoming metadata that were communicated to the user community on 3/8/2019. The bug manifested itself in incorrect metadata fields in the Wyoming Basic Safety Message (BSM) and Traveler Information Message (TIM) data. All payload data included in BSM and TIM are unaffected and entirely correct.
...
As noted previously, the bug manifested itself in incorrect metadata fields in the Wyoming Basic Safety Message (BSM) and Traveler Information Message (TIM) data. All payload data included in BSM and TIM are unaffected and entirely correct. The ODE software version that contributed to the metadata field errors was corrected and deployed to the Wyoming production server on 4/11/2019. As of 4/12/2019, no more invalid data is being deposited to the SDC or the ITS DataHub, and all existing invalid data have been removed as of 4/26/2019.
...
We appreciate your patience over the last few months as we worked to resolve these metadata errors. If you have any questions about these issues or have additional issues to report, please contact <insert appropriate email address here>.
...
We appreciate your patience over the last few months as we worked to resolve these metadata errors. If you have any questions about these issues, concerns about the reorganization of ITS Sandbox data, or additional data issues to report, please contact RDAE_Support@bah.com.
...