Agency	Name of Inventory Item	Description of Inventory Item	Primary Type of AI	Purpose of AI	Length of Usage	Does it directly impact the public?	Vendor System	Other Notes
USDA	Agricultural Research Service - 4% Repair Dashboard	The model reviews the descriptions of expenses tagged to repairs and maintenance and classifies expenses as "repair" or "not repair" based on keywords in context.	Natural Language Processing	Classification or Labeling	Unknown	No impact		
USDA	Agricultural Research Service - Project Mapping	Term analysis and clustering enables program leaders to find synergies and patterns across ARS research program portfolios.	Natural Language Processing	Project Management	Unknown	No impact		
USDA	Agricultural Research Services - NAL Automated Indexing	Uses machine learning for indexing of publication abstracts and project proposals using terms from USDA National Agricultural Library Thesaurus	Machine Learning (Type Unknown)	Classification or Labeling	Unknown	No impact	Yes (Cogito)	
USDA	Forecasting Grasshopper Outbreaks in the Western United States using Machine Learning Tools	Integrate historic grasshopper survey data and grasshopper biology with environmental covariates (e.g., climate, soil, and topography) to generate grasshopper outbreaks forecasts for the western U.S.	Maximum Entropy Model (MaxEnt)	Forecasting & Prediction	Unknown	No impact		
USDA	Agricultural Research Services - Facial Recognition	Facial recognition as one of several factors for access to secure areas of a facility	Facial Recognition	Security	Unknown	Direct impact		
USDA	Economic Research Service - Coleridge Initiative	The purpose of this project is to use AI tools to understand how publicly funded data and evidence are used to serve science and society.	Natural Language Processing	Research (Other)	Unknown	No impact		
USDA	Economic Research Service - Westat	A competition to find automated, yet effective, ways of linking USDA nutrition information to 750K food items in a proprietary data set of food purchases and acquisitions	Natural Language Processing	Organization & Efficiency	Unknown	Indirect impact		
USDA	Farm Production and Conservation - Land Change Analysis Tool	Employs a learning classifier to produce high-resolution land cover maps from aerial and/or satellite imagery and publishes the results through a publicly available image service. Training data is generated from a custom-built web application.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Unknown	No impact		LCAT has mapped over 600 million acres and has generated over 700 thousand training samples.
USDA	Food and Nutrition Services - Retailer Receipt Analysis	Use OCR on a sample of FNS receipt and invoice data; consultants will use this data to see how the existing manual process can be automated, saving staff time, ensuring accurate review, and detecting difficult patterns. Will pave the way for a review system that (1) has an automated workflow and learns from analyst feedback and (2) can incorporate known SNAP fraud patterns, look for new patterns, and visualize alerts on these patterns on retailer invoices and receipts.	Optical Character Recognition (or Text Extraction)	Organization & Efficiency	Short-term project or study	Indirect impact		
USDA	Forest Service - Ecosystem Management Decision Support System (EMDS)	EMDS is a spatial decision support system for landscape analysis and planning that runs as a component of ArcGIS and QGIS.	Machine Learning (Type Unknown)	Mapping	Ongoing project (time unknown)	No impact		Users develop applications for their specific problem that may use any combination of four AI engines for 1) logic processing, 2) multi-criteria decision analysis, 3) Bayesian networks, and 4) Prolog-based decision trees.
USDA	Forest Service - Wildland Urban Interface - Mapping Wildfire Loss	Uses machine learning to identify buildings, building loss, and defensible space around buildings before and after a wildfire event in wildland-urban interface settings.	Neural Networks	Mapping	Ongoing project (time unknown)	Indirect impact		Also uses image-based classification
USDA	Forest Service - National Land Cover Database (NLCD) Tree Canopy Cover Mapping	Responsible for producing maps with consistent spatial resolution. The forest structure maps are generated using over 60,000 training plots with a probabilistic sample design to train statistical machine learning models to classify continuous tree canopy cover.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Unknown	No impact		
USDA	Forest Service - BigMAP project	The project uses machine learning, along with features derived from dense time series of Landsat imagery as well as climatic and topographic data, to impute attributes from national forest inventory database to produce raster maps of U.S. forest resources.	Machine Learning (Type Unknown)	Mapping	Unknown	No impact		
USDA	Forest Service - DISTRIB-II: Habitat Suitability of Eastern United States Trees	Habitat suitability is modeled for 125 eastern United States tree species under 1981-2010 climate conditions and 8 projected future conditions (2070-2099).	Unclear	Forecasting & Prediction	Short-term project or study	Indirect impact		The AI provides insight into options for managing eastern U.S. forests.
USDA	Forest Service - CLT Knowledge Database	The CLT knowledge database catalogs cross-laminated timber information in an interface that helps users find relevant information. The information system uses data aggregator bots that search the internet for relevant information. These bots search for hundreds of keywords and use machine learning to determine if what is found is relevant.	Machine Learning (Type Unknown)	Research (Other)	Ongoing project (time unknown)	No impact		As of 2/24/2022, the CLT knowledge database has cataloged >3,600 publications on various aspects of CLT. Manufacturers, researchers, design professionals, code officials, government agencies, and other stakeholders directly benefit from the tool, thereby supporting the increasing use of mass timber, which benefits forest health by increasing the economic value of forests.
USDA	Forest Service - RMRS Raster Utility	RMRS Raster Utility is a .NET object-oriented library that simplifies data acquisition, raster sampling, and statistical and spatial modeling while reducing the processing time and storage space associated with raster analysis.	Machine Learning (Type Unknown)	Organization & Efficiency	Ongoing project (time unknown)	No impact		
USDA	Forest Service - TreeMap 2016	TreeMap 2016 provides a tree-level model of the forests of the conterminous United States. It matches forest plot data from Forest Inventory and Analysis (FIA) to a 30x30 meter (m) grid. TreeMap 2016 is being used in both the private and public sectors for projects including fuel treatment planning, snag hazard mapping, and estimation of terrestrial carbon resources.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Ongoing project (time unknown)	Indirect impact		A random forests machine-learning algorithm was used to impute the forest plot data to a set of target rasters provided by Landscape Fire and Resource Management Planning Tools (LANDFIRE).
USDA	Forest Service - Landscape Change Monitoring System (LCMS)	National Landsat/sentinel remote sensing-based data produced by the USDA Forest Service for mapping and monitoring changes related to vegetation canopy cover, as well as land cover and land use.	Unclear	Mapping	Ongoing project (time unknown)	No impact		
USDA	Forest Service - Forest Health Detection Monitoring	Machine learning models are used to (1) upscale training data that was collected from both the field and high-resolution imagery to map and monitor stages of forest mortality and defoliation, and (2) to post-process raster outputs to vector polygons.	Machine Learning (Type Unknown)	Mapping	Unknown	No impact		
USDA	Forest Service - Land Cover Data Development	Apply supervised classification methods with remotely sensed satellite aerial imagery and other landscape variables (e.g., digital elevation derivations, soils data, geology data, etc.) for labelling segments or pixels with land attributes for a landscape/study area.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Classification or Labeling	Unknown	No impact		
USDA	National Agricultural Statistics Service - Cropland Data Layer	Interpret readings from satellite-based sensors and classify the type of crop or activity that falls in each 30-meter by 30-meter pixel on the ground.	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (in production more than a year)	No impact		The CDL has been produced for national coverage since 2008.
USDA	National Agricultural Statistics Service - List Frame Deadwood Identification	The deadwood model produces a propensity score representing a relative likelihood of a farm operation being out of business. Common tree splits were identified using the model and combined with expert knowledge to develop a recurring process for deadwood clean up.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Unknown	Indirect impact		
USDA	National Institute for Food and Agriculture - Climate Change Classification NLP	The model classifies NIFA funded projects as climate change related or not climate related through natural language processing techniques	Natural Language Processing	Classification or Labeling	Unknown	No impact		
USDA	Office of Safety, Security, and Protection - Video Surveillance System	The Video Surveillance System shall control multiple sources of video surveillance subsystems to collect, manage, and present video clearly and concisely	Facial Recognition	Security	Unknown	Direct impact		
USDA	Office of the Chief Information Officer - Acquisition Approval Request Compliance Tool	NLP model developed to utilize the text in procurement header and line descriptions within USDA's Integrated Acquisition System to determine the likelihood that an award is IT-related, and therefore might require an AAR.	Natural Language Processing	Classification or Labeling	Ongoing project (time unknown)	No impact		

USDA	Operational water supply forecasting for western US rivers	The USDA National Water and Climate Center operates the largest forecast system of spring-summer river flow volumes. The NWCC recently developed a next-generation prototype for generating such operational water supply forecasts, the multi-model machine-learning metasystem, which integrates a variety of AI and other data-science technologies carefully chosen or developed to satisfy specific user needs.	Machine Learning (Type Unknown)	Forecasting & Prediction	Ongoing project (time unknown)	Indirect impact		

USDED	Federal Student Aid - Aidan Chat-Bot	FSA's virtual assistant answers common financial aid questions and helps customers get information about their federal aid on StudentAid.gov.	Natural Language Processing	Chat Bot	Ongoing project (time unknown)	Direct impact		In just over two years, Aidan has interacted with over 2.6 million unique customers, resulting in more than 11 million user messages.
USDOC	International Trade Administration - B2B Matchmaking	The system's algorithms and AI technology qualify data and make B2B matches with event participants according to their specific needs and available opportunities.	Unclear	Classification or Labeling	Unknown	No impact		

USDOC	International Trade Administration - ChatBot Pilot	Chatbot embedded into trade.gov to assist ITA clients with FAQs, locating information and content, suggesting events and services. 	Natural Language Processing	Chat Bot	Unknown	Direct impact		

USDOC	International Trade Administration - Consolidated Screening List	The CSL search engine has “Fuzzy Name Search” capabilities, allowing a search without knowing the exact spelling of an entity’s name. In Fuzzy Name mode, the CSL returns a “score” for results that exactly or nearly match the searched name. 	Natural Language Processing	Forecasting & Prediction	Unknown	No impact		The Consolidated Screening List (CSL) is a list of parties for which the United States Government maintains restrictions on certain exports, reexports, or transfers of items. It consists of the consolidation of 13 export screening lists of the Departments of Commerce, State, and Treasury. 

USDOC	International Trade Administration - AD/CVD Self Initiation 	The ADCVD program investigates allegations of dumping and/or countervailing of duties. Investigations are initiated when a harmed US entity files a petition identifying the alleged offence and the specific harm inflicted. Self-Initiation will allow ITA to monitor trade patterns for this activity and preemptively initiate investigations by identifying harmed US entities.	Unclear	Classification or Labeling	Unknown	Indirect impact		

USDOC	International Trade Administration - Market Diversification Toolkit	A user enters what products they make and the markets they currently export to. The Market Diversification Tool applies a ML algorithm to identify and compare potential new export markets that should be considered. The tool brings together product-specific trade and tariff data and economy-level macroeconomic and governance data to provide a picture of which markets make sense for further market research. 	Machine Learning (Type Unknown)	Research (Other)	Unknown	No impact		

USDOC	NOAA - Fisheries Electronic Monitoring Image Library 	The Fisheries Electronic Monitoring Library (FEML) will be the central repository for electronic monitoring (EM) data related to marine life. 	Automated Image Processing	Monitoring or Detection	Planning or development stage	No impact		

USDOC	NOAA - Passive acoustic analysis using ML in Cook Inlet, AK	Passive acoustic data is analyzed for detection of beluga whales and classification of the different signals emitted by this species.	Neural Networks	Classification or Labeling	Ongoing project (time unknown)	No impact		Results are being used to inform seasonal distribution, habitat use, and impact from anthropogenic disturbance within Cook Inlet beluga critical habitat. The project aims to expand to other cetacean species as well as anthropogenic noise.

USDOC	NOAA - AI-based automation of acoustic detection of marine mammals 	Command line software which was developed in-house for model training, evaluation, and deployment of machine learning models for the purpose of marine mammal detection in passive acoustic data.	Machine Learning (Type Unknown)	Monitoring or Detection	Ongoing project (time unknown)	No impact		It also includes annotation workflows for labeling and validation.

USDOC	NOAA - Developing automation to determine species and count using optical survey data in the Gulf of Mexico 	Focuses on optical survey collected in the Gulf of Mexico: 1) develops an image library of landed catch, 2) develops automated image processing (ML/DL) to identify and enumerate species from underwater imagery and 3) develops automated algorithms to process imagery in near real time and download information to central database. 	Automated Image Processing	Monitoring or Detection	Unknown	No impact		

USDOC	NOAA - Fast tracking the use of VIAME for automated identification of reef fish 	Compiling image libraries for use in creating automated detection and classification models for use in automating the annotation process for the SEAMAP Reef Fish Video survey of the Gulf of Mexico. VIAME models are performing well enough that we will incorporate automated analysis in video reads soon as part of a supervised annotation-qa/qc process. 	Automated Image Processing	Monitoring or Detection	Ongoing project (time unknown)	No impact		

USDOC	NOAA - A Hybrid Statistical-Dynamical System for the Seamless Prediction of Daily Extremes and Subseasonal to Seasonal Climate Variability	Demonstrate the skill and suitability for operations of a statistical-dynamical prediction system that yields seamless probabilistic forecasts of daily extremes and subseasonal-to-seasonal temperature and precipitation.	Machine Learning (Type Unknown)	Forecasting & Prediction	Unknown	Indirect impact		Recently demonstrated a Bayesian statistical method for post-processing seasonal forecasts of mean temperature and precipitation from the North American Multi-Model Ensemble (NMME).

USDOC	NOAA - FathomNet	FathomNet provides much-needed training data (e.g., annotated, and localized imagery) for developing machine learning algorithms that will enable fast, sophisticated analysis of visual data. 	Machine Learning (Type Unknown)	Research (Other)	Ongoing project (time unknown)	No impact		 

USDOC	NOAA - ANN to improve CFS T and P outlooks	Using Artificial Neural Networks to Improve CFS Week 3-4 Precipitation and Temperature Forecasts	Neural Networks	Forecasting & Prediction	Unknown	Indirect impact		

USDOC	NOAA - Drought outlooks by using ML techniques	Drought outlooks by using ML techniques with NCEP models	Machine Learning (Type Unknown)	Forecasting & Prediction	Ongoing project (time unknown)	Indirect impact		

USDOC	NOAA - EcoCast	Operational tool that uses boosted regression trees to model the distribution of swordfish and bycatch species in the California Current.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Unknown	No impact		

USDOC	NOAA - Coastal Change Analysis Program (C-CAP)	C-CAP embarked on operational high resolution land cover development effort that utilized geographic object-based image analysis and ML algorithms such as Random Forest to classify coastal land cover from 1m multispectral imagery.	Automated Image Processing	Classification or Labeling	Ongoing project (in production more than a year)	No impact		

USDOC	NOAA - Deep learning algorithms to automate right whale photo id 	AI for right whale photo ID has expanded to include several algorithms to match right whales from different viewpoints (aerial, lateral) and body part (head, fluke, peduncle). 	Neural Networks	Classification or Labeling	Unknown	No impact		

USDOC	NOAA - NN Radiation	Developing fast and accurate NN LW- and SW radiations for GFS and GEFS.	Neural Networks	Forecasting & Prediction	Short-term project or study	No impact		

USDOC	NOAA - NN training software for the new generation of NCEP models 	Optimize NCEP EMC Training and Validation System for efficient handling of high spatial resolution model data produced by the new generation of NCEP's operational models.	Neural Networks	Organization & Efficiency	Unknown	No impact		
USDOC	NOAA - Coral Reef Watch	Offering the world's only global early-warning system of coral reef ecosystem physical environmental changes, CRW remotely monitors conditions that can cause coral bleaching, disease, and death; delivers information and early warnings in near real-time to our user community; and uses operational climate forecasts to provide outlooks of stressful environmental conditions at targeted reef locations worldwide. CRW products are primarily sea surface temperature (SST)-based but also incorporate light and ocean color, among other variables. 	Unclear	Forecasting & Prediction	Ongoing project (time unknown)	No impact		

USDOC	NOAA - Robotic microscopes and machine learning algorithms remotely and autonomously track lower trophic levels for improved ecosystem monitoring and assessment	Deploy the Imaging Flow Cytobot on fixed (docks) and roving (aboard survey ships) platforms to autonomously monitor phytoplankton communities in aquaculture areas in Puget Sound and in the California Current System. Map the distribution and abundance of phytoplankton functional groups and their relative food value to support fisheries and aquaculture and describe their changes in relation to ocean and climate variability and change. Automated taxonomic identification of imaged phytoplankton uses a supervised machine learning approach (random forest algorithm). 	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Unknown	No impact		

USDOC	NOAA - Edge AI survey payload development 	This is a nine camera (color, infrared, ultraviolet) payload controlled by dedicated on-board computers with GPUs. YOLO detection models run at a rate faster than image collection, allowing real-time processing of imagery as it comes off the cameras. Goals of effort are to reduce overall data burden and reduce the data processing timeline, expediting analysis and population assessment for arctic mammals. 	Automated Image Processing	Classification or Labeling	Unknown	No impact		

USDOC	NOAA - Ice seal detection and species classification in multispectral aerial imagery 	Refine detection and classification pipelines with the goal of reducing false positive rates (to < 50%) while maintaining > 90% accuracy and significantly reducing labor intensive, post survey review process. 	Automated Image Processing	Monitoring or Detection	Unknown	No impact		

USDOC	NOAA - First Guess Excessive Rainfall Outlook 	First guess for the WPC Excessive Rainfall Outlook - It is learned from the ERO with atmospheric variables.	Machine Learning (Type Unknown)	Forecasting & Prediction	Unknown	No impact		

USDOC	NOAA - CoralNet	Operational point annotation software for benthic photo quadrat annotation; development of classifiers allows for significantly reducing human annotation.	Machine Learning (Type Unknown)	Classification or Labeling	Unknown	No impact		

USDOC	NOAA - Automated detection of hazardous low clouds in support of safe and efficient transportation	Maintenance and sustainment project for the operational GOES-R fog/low stratus (FLS) products, routinely used by the NWS Aviation Weather Center and Weather Forecast Offices. 	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	No impact		The FLS products are derived from the combination of GOES-R satellite imagery and NWP data using machine learning.

USDOC	NOAA - The Development of ProbSevere v3 	ProbSevere is a ML model that utilizes NWP, satellite, radar, and lightning data to nowcast severe wind, severe hail, and tornadoes. ProbSevere v3 utilizes additional data sets and improved machine learning techniques to improve upon the operational version of ProbSevere. 	Machine Learning (Type Unknown)	Forecasting & Prediction	Unknown	Direct impact		ProbSevere v3 utilizes additional data sets and improved machine learning techniques to improve upon the operational version of ProbSevere. 

USDOC	NOAA - The VOLcanic Cloud Analysis Toolkit: System for detecting, tracking, characterizing, and forecasting hazardous volcanic events 	Consists of several AI powered satellite applications including: eruption detection, alerting, and volcanic cloud tracking. These applications are routinely utilized by Volcanic Ash Advisory Centers to issue volcanic ash advisories.	Unclear	Monitoring or Detection	Unknown	Direct impact		

USDOC	NOAA - SUVI Thematic Maps	The SUVI Thematic Maps product is a Level 2 data product that (presently) uses a machine learning classifier to generate a pixel-by-pixel map of important solar features digested from all six SUVI spectral channels. 	Machine Learning (Type Unknown)	Mapping	Unknown	No impact		

USDOL	Form Recognizer for Benefits Forms	Custom machine learning model to extract data from complex forms to tag data entries to field headers.	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (in production less than six months)	Direct impact		Machine learning model uses computer vision.

USDOL	Claims Document Processing	Identifies if a physician’s note contains causal language by training custom natural language processing models	Natural Language Processing	Monitoring or Detection	Planning or development stage	Direct impact		Natural language processing for (a) document classification and (b) sentence-level causal passage detection

USDOL	Website Chatbot Assistant	The chatbot helps the end user with basic information about the program, information on who to contact, or seeking petition case status.	Natural Language Processing	Chat Bot	Planning or development stage	Direct impact		

USDOL	Data Ingestion of Payroll Forms	Custom machine learning model to extract data from complex forms to tag data entries to field headers.	Natural Language Processing	Classification or Labeling	Planning or development stage	Direct impact		

USDOL	Hololens	AI used by Inspectors to visually inspect high and unsafe areas from a safe location.	Unclear	Mapping	Ongoing project (in production more than a year)	No impact		

USDOL	SOII Computer-Assisted Coding	The Survey of Occupational Injuries and Illnesses (SOII) collects hundreds of thousands of narratives describing cases of work-related injury and illness annually. Autocoders assign classifications for worker occupation, nature of injury, part of body, event or exposure, source, and secondary source for each case.	Natural Language Processing	Classification or Labeling	Ongoing project (in production more than a year)	No impact		Autocoders subsequently expanded and coded 85% of all SOII elements for reference year (RY) 2019. This gradual increase occurred by adapting the selection criterion based on careful monitoring of the processes; project also uses deep neural networks with character-level convolutional embeddings and Long-Short-Term-Memory recurrent layers

USDOS	Bureau of Global Public Affairs - CLIPSLAB	GPA’s production media collection and analysis system that pulls data from half a dozen different open and commercial media clips services to give an up-to-date global picture of media coverage around the world.	Network Analysis (ie Bayesian, or Social Network)	Organization & Efficiency	Ongoing project (time unknown)	No impact		

USDOS	Bureau of Global Public Affairs - Mission Press Digest	A prototype system that collects and analyzes the daily media clips reports from about 70 different Embassy Public Affairs Sections.		Organization & Efficiency	Ongoing project (time unknown)	No impact		

USDOS	Bureau of Global Public Affairs - Digital Communications Database	GPA’s production system for collecting, analyzing, and summarizing the global digital content footprint of the Department.		Organization & Efficiency	Ongoing project (time unknown)	No impact		

USDOS	Bureau of Global Public Affairs - Facebook Ad Test Optimization System	GPA’s production system for testing potential messages at scale across segmented foreign sub-audiences to determine effective outreach to target audiences.		Forecasting & Prediction	Ongoing project (time unknown)	Direct impact		

USDOS	Bureau of Global Public Affairs - Global Audience Segmentation Framework	GPA’s prototype framework for predicting how content or messages designed for one audience some place in the world will resonate with other audiences outside the United States.		Forecasting & Prediction	Ongoing project (time unknown)	Direct impact		

USDOS	Bureau of Global Public Affairs - Machine-Learning Assisted Measurement and Evaluation of Public Outreach	A high-performing classifier capable of measuring the level of six different emotions that a text evokes, including time-series analysis, to help evaluate historical messages and predict successful future public messaging.	Machine Learning (Type Unknown)	Monitoring or Detection	Ongoing project (time unknown)	Direct impact		

USDOS	Bureau of Global Public Affairs - GPA Tools and GPAIX	AI-enabled analysis package for automating public outreach analysis.	Unclear	Research (Other)	Ongoing project (time unknown)	Direct impact		

USDOS	Bureau of Political-Military Affairs - Pull Information from Unstructured Text	Use natural language processing to extract information from document text to help summarize and allow for analysis more efficiently than manual methods.	Natural Language Processing	Organization & Efficiency	Ongoing project (time unknown)	No impact		

USDOS	Bureau of Political-Military Affairs - K-Means Clustering Into Tiers	Cluster countries into tiers based on data collected from open source and Bureau data using k-means clustering.	Clustering (K-means, etc.)	Classification or Labeling	Ongoing project (time unknown)	No impact		

USDOS	Global Engagement Center - Disinformation Topic Modeling	Text clustering and topic modeling of documents and social media to determine possible disinformation subjects and topics.	Clustering (K-means, etc.)	Mapping	Ongoing project (time unknown)	No impact		

USDOS	Global Engagement Center - Deepfake Detector	Classifies facial images as either being real (contains a real person’s face) or fake (synthetically generated face, a deepfake often created using Generative Adversarial Networks) to predict disinformation activities.	Neural Networks	Classification or Labeling	Ongoing project (time unknown)	No impact		

USDOS	Global Engagement Center - Text Similarity Detection	Identifies different texts that are identical or nearly identical by calculating cosine similarity between each pair of texts. Texts are then grouped if they share high cosine similarity and then available for analysts to review further.	Natural Language Processing	Research (Other)	Ongoing project (time unknown)	No impact		

USDOS	Global Engagement Center - Image Clustering for Disinformation Detection	Identifies similar images in order to analyze how images are used to spread and build traction with disinformation narratives.	Automated Image Processing	Research (Other)	Ongoing project (time unknown)	No impact		

USDOS	Global Engagement Center - Louvain Community Detection	Clusters nodes together into “communities” to detect clusters of accounts possibly spreading disinformation.	Network Analysis (ie Bayesian, or Social Network)	Research (Other)	Ongoing project (time unknown)	No impact		

USDOS	Office of U.S. Foreign Assistance Resources - Foreign Assistance Appropriations	Automates and streamlines the extraction of earmarks and directives from the annual appropriations bill to facilitate the Department’s adherence to congressional direction.	Natural Language Processing	Organization & Efficiency	Ongoing project (time unknown)	No impact		

USDOS	Office of Management Strategy and Solutions - Department Cables Analytics	Analysis of Department cables reporting to inform multiple areas of Department policy and operations.	Natural Language Processing	Project Management	Ongoing project (time unknown)	No impact		

USDOS	CSO - Automated Burning Detection	The Village Monitoring System program conducts daily scans of moderate resolution commercial satellite imagery to identify anomalies using the near-infrared band.	Machine Learning (Type Unknown)	Monitoring or Detection	Ongoing project (time unknown)	No impact		

USDOS	CSO - Automated Damage Assessments	The Conflict Observatory program analyzes moderate and high-resolution commercial satellite imagery to document a variety of war crimes and other abuses in Ukraine, including automated damage assessments of a variety of buildings, including critical infrastructure, hospitals, schools, crop storage facilities.	Machine Learning (Type Unknown)	Monitoring or Detection	Ongoing project (in production less than a year)	No impact		

USDVA	Physical Therapy App	A data source agnostic tool which takes input from a variety of wearable sensors and then analyzes the data to give feedback to the physical therapist in an explainable format.	Unclear	Monitoring or Detection	Unknown	Direct impact		

USDVA	Coach in Cardiac Surgery	Infers misalignment in team members’ mental models during complex healthcare task execution. Of interest are safety-critical domains (e.g., aviation, healthcare), where lack of shared mental models can lead to preventable errors and harm. Identifying model misalignment provides a building block for enabling computer-assisted interventions to improve teamwork and augment human cognition in the operating room.	Unclear	Organization & Efficiency	Unknown	Direct impact		

USDVA	AI Cure	A phone app that monitors adherence to orally prescribed medications during clinical or pharmaceutical sponsor drug studies.	Unclear	Monitoring or Detection	Unknown	Direct impact		

USDVA	Acute Kidney Injury	Focuses on detecting acute kidney injury (AKI), ranging from minor loss of kidney function to complete kidney failure. The artificial intelligence can also detect AKI that may be the result of another illness.	Unclear	Monitoring or Detection	Unknown	Direct impact		Project is in collaboration with Google DeepMind

USDVA	Assessing lung function in health and disease	Determines predictors of normal and abnormal lung function and sleep parameters.	Unclear	Forecasting & Prediction	Unknown	Direct impact		

USDVA	Automated eye movement analysis and diagnostic prediction of neurological disease	Recursively analyzes previously collected data to both improve the quality and accuracy of automated algorithms, as well as to screen for markers of neurological disease (e.g. traumatic brain injury, Parkinson's, stroke, etc).	Unclear	Monitoring or Detection	Unknown	Direct impact		
USDVA	Automatic speech transcription engines to aid scoring neuropsychological tests.	Automated speech transcription engines analyze the cognitive decline of older VA patients. Digitally recorded speech responses are transcribed using multiple artificial intelligence-based speech-to-text engines. The transcriptions are fused together to reduce or obviate the need for manual transcription of patient speech in order to score the neuropsychological tests.	Speech-to-Text	Organization & Efficiency	Unknown	Direct impact		
USDVA	Curapatient	Allows patients to better manage their conditions without having to see a provider. It allows patients to create a profile to track their health, enroll in programs, manage insurance, and schedule appointments.	Unclear	Organization & Efficiency	Unknown	Direct impact		
USDVA	Digital Command Center	Seeks to consolidate all data in a medical center and apply predictive and prescriptive analytics to allow leaders to better optimize hospital performance.	Unclear	Forecasting & Prediction	Unknown	Direct impact		
USDVA	Disentangling dementia patterns using artificial intelligence on brain imaging and electrophysiological data	Predict the various patterns of dementia seen on MRI and EEG and explore the use of these imaging modalities as biomarkers for various dementias and epilepsy disorders. The VA is performing retrospective chart review to achieve this.	Neural Networks	Forecasting & Prediction	Unknown	Direct impact		
USDVA	Enhanced diagnostic error detection and ML classification of protein electrophoresis text	Researchers are performing chart review to collect true/false positive annotations and construct a vector embedding of patient records, followed by similarity-based retrieval of unlabeled records "near" the labeled ones (semi-supervised approach). Embedding inputs will be selected high-value structured data pertinent to stroke risk and possibly selected text notes.	Support Vector Machines	Organization & Efficiency	Unknown	Indirect impact		
USDVA	Behavidence	Veterans download the app onto their phone and it compares their phone usage to that of a digital phenotype that represents people with confirmed diagnosis of mental health conditions.	Unclear	Monitoring or Detection	Unknown	Direct impact		Seems very invasive
USDVA	Tools to predict outcomes of hospitalized VA patients	An IRB-approved study which aims to examine machine learning approaches to predict health outcomes of VA patients. It will focus on the prediction of Alzheimer's disease, rehospitalization, and Clostridioides difficile infection.	Machine Learning (Type Unknown)	Forecasting & Prediction	Short-term project or study	Direct impact		

USDVA	Nediser	A continuously trained “radiology resident” that assists radiologists in confirming the X-ray properties in their radiology reports. Nediser can select normal templates, detect hardware, evaluate patella alignment and leg length and angle discrepancy, and measure Cobb angles.	Unclear	Monitoring or Detection	Unknown	Direct impact		

USDVA	Precision medicine PTSD and suicidality diagnostic and predictive tool	Interprets real time inputs to forewarn episodes of PTSD and suicidality, support early and accurate diagnosis of the same, and gain a better understanding of the short and long term effects of stress, as it relates to the onset of PTSD.	Unclear	Forecasting & Prediction	Unknown	Direct impact		

USDVA	Prediction of Veterans' Suicidal Ideation following Transition from Military Service	Model uses relevant data from a web-based survey of veterans’ experiences within three months of separation and every six months after for the first three years after leaving military service to predict veterans' suicidal ideation.	Machine Learning (Type Unknown)	Forecasting & Prediction	Unknown	Direct impact		

USDVA	PredictMod	Determines if predictions can be made about diabetes based on the gut microbiome.	Unclear	Forecasting & Prediction	Unknown	Direct impact		

USDVA	Predictor Profiles of OUD and overdose	Evaluates the interactions of known and novel risk factors for opioid use disorder (OUD) and overdose in Post-9/11 Veterans.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Unknown	Direct impact		

USDVA	Provider directory data accuracy and system of record alignment	AI is used to add value as a transactor for intelligent identity resolution and linking. AI also has a domain cache function that can be used for both Clinical Decision Support and for intelligent state reconstruction over time and real-time discrepancy detection. As a synchronizer, AI can perform intelligent propagation and semi-automated discrepancy resolution.	Unclear	Organization & Efficiency	Unknown	No impact		

USDVA	Seizure detection from EEG and video	Uses EEG and video data from a VHA epilepsy monitoring unit in order to automatically identify seizures without human intervention.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	Direct impact		

USDVA	SoKat Suicidal Ideation Detection Engine	Improves identification of Veteran suicide ideation from survey data collected by the Office of Mental Health Veteran Crisis Line support team.	Natural Language Processing	Monitoring or Detection	Unknown	Indirect impact		

USDVA	Predict perfusionists’ critical decision-making during cardiac surgery	Builds predictive models of perfusionists’ decision-making during critical situations that occur in the cardiopulmonary bypass phase of cardiac surgery. 	Machine Learning (Type Unknown)	Forecasting & Prediction	Unknown	Direct impact		Results may inform future development of computerized clinical decision support tools to be embedded into the operating room, improving patient safety and surgical outcomes.

USDVA	Gait signatures in patients with peripheral artery disease	Previously collected biomechanics data is used to 1) determine representative gait signatures of patients with peripheral artery disease (PAD) and 2) assess the ability of limb acceleration measurements to identify and model the meaningful biomechanics measures from PAD data.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	Direct impact		

USDVA	MedSafe Clinical Decision Support	Analyzes current clinical management for diabetes, hypertension, and chronic kidney disease, and makes patient-specific, evidence-based recommendations to primary care providers.	Unclear	Monitoring or Detection	Unknown	Direct impact		The system uses knowledge bases that encode clinical practice guideline recommendations and an automated execution engine to examine multiple comorbidities, laboratory test results, medications, and history of adverse drug events in evaluating patient clinical status and generating patient-specific recommendations

USDVA	Prediction of health outcomes, including suicide death, opioid overdose, and decompensated outcomes of chronic diseases	Using electronic health records (EHR) (both structured and unstructured data) as inputs, this tool outputs deep phenotypes and predictions of health outcomes including suicide death, opioid overdose, and decompensated outcomes of chronic diseases.	Unclear	Forecasting & Prediction	Unknown	Direct impact		

USDVA	VA-DoE Suicide Exemplar Project	Improves VA's ability to identify Veterans at risk for suicide through three closely related projects that all involve collaborations with the Department of Energy.	Unclear	Monitoring or Detection	Unknown	Direct impact		
USDVA	Disease progression of hepatitis C virus	Predicts disease progression among veterans with hepatitis C virus.	Machine Learning (Type Unknown)	Forecasting & Prediction	Unknown	Direct impact		
USDVA	Biologic response to thiopurines	Predicts biologic response to thiopurines among Veterans with inflammatory bowel disease.	Unclear	Forecasting & Prediction	Unknown	Direct impact		
USDVA	Predicting hospitalization and corticosteroid use as a surrogate for IBD flares	Examines data from 20,368 Veterans Health Administration patients with an inflammatory bowel disease (IBD) diagnosis between 2002 and 2009. Longitudinal labs and associated predictors were used to predict hospitalizations and steroid usage as a surrogate for IBD flares.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Short-term project or study	Direct impact		
USDVA	Predicting corticosteroid free endoscopic remission with Vedolizumab in ulcerative colitis	Predicts the outcome of corticosteroid-free biologic remission at week 52 on the testing cohort.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Short-term project or study	Direct impact		This work is focused on a cohort of 594 patients. Models were constructed using baseline data or data through week 6 of VDZ therapy.
USDVA	Predict surgery in Crohn’s disease	Analyzes patient demographics, medication use, and longitudinal laboratory values collected between 2001 and 2015 from adult patients in the Veterans Integrated Service Networks 10 cohort. The data was used for analysis in prediction of Crohn’s disease and to model future surgical outcomes within one year.	Machine Learning (Type Unknown)	Forecasting & Prediction	Short-term project or study	Direct impact		
USDVA	Reinforcement learning evaluation of treatment policies for patients with hepatitis C virus	Predicts disease progression among veterans with hepatitis C virus.	Machine Learning (Type Unknown)	Forecasting & Prediction	Unknown	Direct impact		
USDVA	Predicting hepatocellular carcinoma in patients with hepatitis C	Examines whether deep learning recurrent neural network (RNN) models that use raw longitudinal data extracted directly from electronic health records outperform conventional regression models in predicting the risk of developing hepatocellular carcinoma (HCC).	Neural Networks	Forecasting & Prediction	Short-term project or study	No impact		This prognostic study used data on patients with hepatitis C virus (HCV)-related cirrhosis in the national Veterans Health Administration who had at least 3 years of follow-up after the diagnosis of cirrhosis.
USDVA	Computer-aided detection and classification of colorectal polyps	The models receive video frames from colonoscopy video streams and analyze them in real time in order to (1) detect whether a polyp is in the frame and (2) predict the polyp's malignant potential.	Unclear	Forecasting & Prediction	Short-term project or study	Direct impact		

USDVA	GI Genius (Medtronic)	Aids in detection of colon polyps.	Unclear	Monitoring or Detection	Unknown	Direct impact		

USDVA	Extraction of family medical history from patient records	Uses TIU documentation on African American Veterans aged 45-50 to extract family medical history data and identify Veterans who are at risk of prostate cancer but have not undergone prostate cancer screening.	Unclear	Forecasting & Prediction	Short-term project or study	Direct impact		

USDVA	VA/IRB approved research study for finding colon polyps	Uses a randomized trial for finding colon polyps with artificial intelligence.	Unclear	Monitoring or Detection	Short-term project or study	Indirect impact		

USDVA	Interpretation/triage of eye images	Triages eye patients cared for through telehealth, interprets eye images, and assesses health risks based on retina photos. The goal is to improve diagnosis of a variety of conditions, including glaucoma, macular degeneration, and diabetic retinopathy.	Automated Image Processing	Monitoring or Detection	Unknown	Direct impact		

USDVA	Screening for esophageal adenocarcinoma	National VHA administrative data is used to adapt tools that use electronic health records to predict the risk for esophageal adenocarcinoma.	Unclear	Forecasting & Prediction	Unknown	Direct impact		

USDVA	Social determinants of health extractor	AI is used with clinical notes to identify social determinants of health (SDOH) information. The extracted SDOH variables can be used during associated health related analysis to determine, among other factors, whether SDOH can be a contributor to disease risks or healthcare inequality.	Unclear	Forecasting & Prediction	Unknown	Direct impact		

EPA	Predict exposure pathways	Chemical structure and physicochemical properties were used to predict the probability that a chemical might be associated with any of four exposure pathways leading from sources (near-field consumer, dietary, far-field industrial, and far-field pesticide) to the general population. We then used exposure pathways to organize predictions from 13 different exposure models as well as other predictors of human intake rates. We created a consensus meta-model using the Systematic Empirical Evaluation of Models framework.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Short-term project or study	Indirect impact		The balanced accuracies of these source-based exposure pathway models range from 73 to 81%, with the error rate for identifying positive chemicals ranging from 17 to 36%.

EPA	Records categorization	Predict the retention schedule for records; the model will be incorporated into a records management application to help users apply retention schedules when they submit new records.	Machine Learning (Type Unknown)	Forecasting & Prediction	Ongoing project (time unknown)	No impact		

EPA	Enforcement Targeting	Improves enforcement of environmental regulations through facility inspections by the EPA and state partners.	Unclear	Forecasting & Prediction	Ongoing project (time unknown)	No impact		The resulting predictive analytics showed a 47% improvement of identifying violations of the Resource Conservation and Recovery Act.

HHS	Health Resources and Services Administration Electronic Handbooks AI Chatbot	Built to allow grantees to communicate with the EHBs Chatbot using regular natural conversational expressions; provides knowledge- and action-based responses through a self-service platform with 24/7 availability	Natural Language Processing	Chat Bot	Ongoing project (time unknown)	Direct impact		

HHS	Health Resources and Services Administration BHW Community Need Analysis Platform	Allows for BHW to dynamically assess the healthcare need of a population given a specific use case and relevant datasets. The output of the model will be used as part of the Notice of Funding Opportunity (NOFO) grant proposal evaluation process.	Machine Learning (Type Unknown)	Forecasting & Prediction	Ongoing project (time unknown)	Indirect impact		The first use case being developed is for primary care with behavioral health integration which uses a machine learning based automated clustering engine.

HHS	CDC - ICD-10 Coding of Cause of Death reported on Death Certificates (MedCoder)	MedCoder assigns ICD-10 cause of death codes to the literal text cause of death description provided by the cause of death certifier on the death certificate.	Unclear	Organization & Efficiency	Unknown	No impact		

HHS	CDC - Item Nonresponse Detection in Open-text Response Data	Developing an item nonresponse detection model, to identify cases of item nonresponse (e.g., gibberish, uncertain/don’t know, refusals, or high-risk) among open-text responses to help improve survey data and question and questionnaire design.	Natural Language Processing	Organization & Efficiency	Planning or development stage	No impact		The system is a Natural Language Processing (NLP) model pre-trained using Contrastive Learning and fine-tuned on a custom dataset from survey responses.

HHS	CDC - Sequential Coverage Algorithm (SCA) in Record Linkage	Used to develop joining methods (or blocking groups) when working with very large datasets. The SCA method improved the efficiency of blocking.	Machine Learning (Type Unknown)	Organization & Efficiency	Ongoing project (time unknown)	No impact		

HHS	CMS - Chatbot - Voice	Assists the CMS Badging Help Desk with an automated phone response for general badging questions allowing help desk personnel to assist employees and contractors with more detailed/larger issues.	Natural Language Processing	Chat Bot	Ongoing project (time unknown)	No impact		

HHS	CMS - Chatbot - Text	Assists the Security team with an automated email response for general physical security questions, allowing the help desk team to assist employees and contractors with more in depth issues.	Natural Language Processing	Chat Bot	Ongoing project (time unknown)	No impact		

HHS	CMS - Feedback Analysis Solution 	Uses CMS or other publicly available data (such as Regulations.Gov) to review public comments and/or analyze other information from internal and external stakeholders.	Natural Language Processing	Organization & Efficiency	Unknown	No impact		The FAS uses Natural Language Processing (NLP) tools to aggregate, sort and identify duplicates to create efficiencies in the comment review process. FAS also uses machine learning (ML) tools to identify topics, themes and sentiment outputs for the targeted dataset.

HHS	CMS - Predictive Intelligence - Incident Assignment for Quality Service Center	Analyzes the short description provided by the end user in order to find key words with previously submitted incidents and assigns the ticket to the appropriate assignment group.	Natural Language Processing	Organization & Efficiency	Unknown	Indirect impact		Predictive Intelligence (PI) is used for incident assignment within the Quality Service Center (QSC). The solution runs on incidents created from the ServiceNow Service Portal. This solution is re-trained with the incident data in our production instance every 3-6 months based on need.

HHS	CMS - Reasonable Accommodation RPA Bot	Pulls HR data related to staffing changes, e.g. promotions, reassignments, change in supervisor, and generates information for action by Reasonable Accommodation staff to ensure disability reasonable accommodations follow the employee.	Natural Language Processing	Organization & Efficiency	Unknown	No impact		

HHS	CMS - Rapid Authority to Operate	Used to identify common blocks of language used in similar ways across system security plan (SSP) documents. CMS could identify similar approaches to solving certain technology or process-related control areas within the Acceptable Risk Safeguards. The output was used to create a list of components to develop control description language in a re-usable way, as part of the Blueprint/Rapid ATO effort to streamline SSP generation for new systems.	Natural Language Processing	Security	Unknown	No impact		

HHS	CMS - Data Lake/Load-Extract-Load-Transform (L-ETL) 	Modernizes the load-extract-load-transform (L-ETL) pipelines and data tooling. CMS will be enhancing Agency security to bring together more system, telemetry and program data in one place with a unifying governance model. 	High-Power Computing	Security	Planning or development stage	No impact		There is no actual ML/AI work being done here today, rather, we are beginning work on the scaffolding that will open up these opportunities in 1-2 years time.

HHS	CMS - Priority Score Model	Ranks providers within the Fraud Prevention System using logistic regression based on program integrity guidelines.	Regression Analysis	Mapping	Unknown	Indirect impact		Inputs - Medicare Claims data, Targeted Probe and Educate (TPE) Data, Jurisdiction information

HHS	CMS - Priority Score Timeliness	Forecast the time needed to work on an alert produced by Fraud Prevention System	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Unknown	No impact		

HHS	CMS - Provider Education 90 Day	Reviews claims for provider before and after education for statistical change in their claim submission patterns	Regression Analysis	Research (Other)	Unknown	No impact		

HHS	FDA - Advanced Semantic Search and Indexing of Text for Tobacco Applications (ASSIST4Tobacco)	Uses semantic indexing to search tobacco authorization applications.	Natural Language Processing	Organization & Efficiency	Planning or development stage	No impact		

HHS	FDA - Artificial Intelligence-based Deduplication Algorithm for Classification of Duplicate Reports in the FDA Adverse Event Reports (FAERS)	The deduplication algorithm is applied to nonpublic data in the FDA Adverse Event Reporting System (FAERS) to identify duplicate reports; structured and unstructured data are used in a probabilistic record linkage approach to score pairs of reports by evaluating multiple data fields and applying relative weights per field.	Natural Language Processing	Organization & Efficiency	Unknown	No impact		The output of potential duplicate reports is further placed in groups to facilitate identification of FAERS reports during case series evaluation for safety issues of concern.

HHS	FDA - Opioid Data Warehouse Term Identification and Novel Synthetic Opioid Detection and Evaluation Analytics	Uses publicly available social media and forensic chemistry data to identify novel referents to drug products in social media text.	Network Analysis (ie Bayesian, or Social Network)	Research (Other)	Unknown	No impact		It uses the FastText library to create vector models of each known NSO-related term in a large social media corpus, and provides users with similarity scores and expected prevalence estimates for lists of terms that could be used to enhance future data gathering efforts

HHS	NIH - National Institute of General Medical Sciences (NIGMS) AI Supported Searches, Information Systems and Tools System	Provides the ability to identify investigators by PPID from Federal RePORTER based on user input of investigator PPIDs; provides the ability to lookup potential matching program officers, including their corresponding predicted Program Area Codes, and ICs based on the input of unstructured scientific data.	Natural Language Processing	Organization & Efficiency	Ongoing project (time unknown)	No impact		DIMA and IRMB have collaborated to develop functions that utilize artificial intelligence and natural language processing methods to produce data relevant to the NIGMS program staff’s mission. These tools are collected into a single system to make them available to the NIGMS community for use on a day-to-day basis.

HHS	NIH - Leveraging AI for Business Process Automation	Automates the initial referral of grant applications to the proper scientific expertise within the Institute.	Natural Language Processing	Organization & Efficiency	Ongoing project (time unknown)	No impact		NIGMS IRMB and DIMA are currently using this NLP/ML algorithm developed in R statistical software to parse grant applications and to determine Project Officer candidates for grant assignment. This process was previously fully manual and required a substantial person hour effort.

HHS	NIH - Grant Application Subject-Matter Classification Tool	Classifies grant applications for review assignment.	Natural Language Processing	Classification or Labeling	Ongoing project (time unknown)	No impact		

HHS	NIH - Splunk IT System Monitoring Software	Aggregates system logs from IT infrastructure systems and endpoints for auditing and monitoring purposes.	Machine Learning (Type Unknown)	Monitoring or Detection	Ongoing project (time unknown)	No impact		

HHS	NIH - COVID-19 Pandemic Vulnerability Index Dashboard	Creates risk profiles, called PVI scorecards, for every county in the United States, continuously updated with the latest data that summarize and visualize overall disease risk.	Unclear	Mapping	Unknown	No impact		

HHS	NIH - Leveraging AI/ML for classification and categorization of scientific concepts	Used for topical characterization of the research portfolio.	Machine Learning (Type Unknown)	Classification or Labeling	Unknown	No impact		Inputs are publications and grants abstracts. These are fed into a text classification model and concept extraction. The outputs are category labels and list of concepts.

HHS	NIH - Machine learning system to predict translational progress in biomedical research	Detects whether a paper is likely to be cited by a future clinical trial or guideline.	Machine Learning (Type Unknown)	Forecasting & Prediction	Unknown	No impact		Despite the noisiness of citation dynamics, as little as 2 years of postpublication data yield accurate predictions about a paper’s eventual citation by a clinical article (accuracy = 84%, F1 score = 0.56; compared to 19% accuracy by chance).

HHS	NIH - Semantic analysis of scientific documents 	Computationally converts words in scientific texts to numbers and summarizes documents by their semantic content by learning relationships between words from their context. 	Neural Networks	Research (Other)	Unknown	No impact		This method is adaptable to specific corpora, including grants and scientific articles.

HHS	NIH - Person-level disambiguation for PubMed authors and NIH grant applicants 	Determines whether author-publication pairs refer to variant representations of the same person; for example, model can determine whether hypothetical records listing Jane Smith and Jane M. Smith were the same person, or two different people, based on variables that include institutional affiliation, co-authorship, and article-affiliated Medical Subject Heading (MeSH) terms.	Neural Networks	Research (Other)	Ongoing project (time unknown)	No impact		High-quality disambiguation is required to correctly link researchers to their grants and outputs including articles, patents, and clinical trials.

HHS	NIH - Program Class Code (Area of Science) Referral for NIAID	Evaluates the projects that are in Referral, Program Analysis Branch and auto assigns these grant applications to the Program Class Codes.	Unclear	Classification or Labeling	Planning or development stage	No impact		The inputs are comprised of approximately 6,000+ grant applications that are currently manually assigned by RPAB Staff. The output would be grant applications that are categorized into their respective PCC's.
HHS	NIH - Research, Condition, and Disease Categorization	RCDC is an electronic budget reporting tool that categorizes projects using AI/NLP. The inputs are grant applications, R&D contracts, intramural projects, inter agency agreements. 	Natural Language Processing	Classification or Labeling	Ongoing project (time unknown)	No impact		
HHS	NIH - Query View Report (QVR) LIKE	The LIKE feature in QVR makes use of the NIH Research, Condition and Disease Categorization (RCDC) indexing results to compare scientific terms associated with a project, person or publication and find scientifically similar projects, persons or publications.	Unclear	Classification or Labeling	Unknown	No impact		
HHS	NIH - Internal Referral Module	Automatically refers projects to Program Officers once the grant application is received.	Natural Language Processing	Organization & Efficiency	Ongoing project (time unknown)	No impact		This process is operating at a high accuracy rate and has effectively eliminated the referral bottleneck.
HHS	NIH Grants Virtual Assistant	Chat Bot to assist users in finding grant related information via OER resources.	Natural Language Processing	Chat Bot	Ongoing project (time unknown)	No impact		
HHS	NIH - Pangolin lineage classifications to support accessing and analysis of SARS-CoV-2 sequence data	The Pango nomenclature, called Pango lineages, is being used by researchers and public health agencies worldwide to track the transmission and spread of SARS-CoV-2, including variants of concern.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Monitoring or Detection	Unknown	No impact		
HHS	NIH - Providing MeSH Check Tag of NLM’s Medical Text Indexer (MTI) ons using Support Vector Machines	Provides confidence scores for a set of MeSH CheckTags to the NLM Medical Text Indexer (MTI) program; these CheckTags are a small set of MeSH Descriptors designed to indicate Species, Sex, and Age in MEDLINE articles.	Machine Learning (Type Unknown)	Classification or Labeling	Unknown	No impact		
HHS	NIH - Determining selection for indexing MEDLINE articles using Neural Network Architecture with a Convolutional Neural Network	Uses a sigmoid activation function to generate a single output value between zero and one, which can be interpreted as the probability of an article being in-scope for MEDLINE.	Neural Networks	Forecasting & Prediction	Unknown	No impact		
HHS	NIH - MetaMap to identify potential terms for indexing MEDLINE articles	Provides a link between the text of biomedical literature and the knowledge, including synonymy relationships, embedded in the Metathesaurus	Natural Language Processing	Mapping	Unknown	No impact		
HHS	NIH - Best Match: New relevance search for PubMed	Best Match is a new relevance search algorithm for PubMed that leverages the intelligence of our users and cutting-edge machine-learning technology as an alternative to the traditional date sort order.	Machine Learning (Type Unknown)	Research (Other)	Ongoing project (time unknown)	No impact		PubMed is a free search engine for biomedical literature accessed by millions of users from around the world each day. With the rapid growth of biomedical literature, finding and retrieving the most relevant papers for a given query is increasingly challenging.
HHS	NIH - SingleCite: Improving single citation search in PubMed	SingleCite is an automated algorithm that establishes a query-document mapping by building a regression function to predict the probability of a retrieved document being the target based on three variables: the score of the highest scoring retrieved document, the difference in score between the two top retrieved documents, and the fraction of a query matched by the candidate citation.	Regression Analysis	Mapping	Ongoing project (time unknown)	No impact		
HHS	NIH - Computed Author: author name disambiguation for PubMed	Author name ambiguity may lead to irrelevant retrieval results; we developed a machine-learning method to score the features for disambiguating a pair of papers with ambiguous names.	Machine Learning (Type Unknown)	Research (Other)	Ongoing project (time unknown)	No impact		
HHS	NIH - National Library of Medicine NLM-Gene: towards automatic gene indexing in PubMed articles	An automatic tool for finding gene names in the biomedical literature.	Natural Language Processing	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	NIH - National Library of Medicine NLM-Chem: towards automatic chemical indexing in PubMed articles	An automatic tool for finding chemical names in the biomedical literature.	Natural Language Processing	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	OIG - Grants Analytics Portal	Enhances staff’s ability to access grants-related data quickly and easily by letting them navigate directly to the text of relevant findings across thousands of audits, discover similar findings, analyze trends, compare data between OPDIVs, and see preliminary assessments of potential anomalies between grantees.	Unclear	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	OIG - Text Analytics Portal	Allows personnel without an analytics background to quickly examine text documents through a related set of search, topic modeling and entity recognition technologies.	Unclear	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	AHRQ - Relevancy Tailoring	Adjusts the ranking of search results so that most relevant results show up at the top of the list.	Unclear	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	AHRQ - Auto-generation Synonyms	Adds synonyms to search queries to improve search results.	Unclear	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	AHRQ - Automated Suggestions	Auto-fills queries as they are typed.	Machine Learning (Type Unknown)	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	AHRQ - Suggested Related Content	Shows related searches that may provide the user with other related, valuable information.	Machine Learning (Type Unknown)	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	AHRQ - Auto Tagging	Suggests content tags automatically based on a machine-driven evaluation of how existing content is tagged.	Machine Learning (Type Unknown)	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	AHRQ - Did you mean	Suggests spelling corrections and reformatted search queries based on Google Analytics data.	Machine Learning (Type Unknown)	Organization & Efficiency	Ongoing project (time unknown)	No impact		
HHS	AHRQ - Chatbot	Responds to plain language queries in real time.	Natural Language Processing	Research (Other)	Ongoing project (time unknown)	No impact		
DOJ	Drug Signature Program Algorithms	Automatically classifies the geographical region of origin of samples selected for DEA's Heroin and Cocaine signature programs. The system provides for detection of anomalies and low confidence results.	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (in production more than a year)	No impact	Agency-generated	Data not available publicly but 1-2 unclassified summary reports for each program are released publicly each year.

DOJ	Complaint Lead Value Probability	Helps to triage immediate threats in order to help FBI field offices and law enforcement respond to the most serious threats first. Based on the algorithm score, highest priority tips are first in the queue for human review.	Unclear	Classification or Labeling	Ongoing project (in production more than a year)	Indirect impact	Agency-generated	Threat Intake Processing System (TIPS) database uses artificial intelligence (AI) algorithms to accurately identify, prioritize, and process actionable tips in a timely manner.

DOJ	Intelligent Records Consolidation Tool	Assesses the similarity of records schedules across all Department records schedules. The tool provides clusters of similar items to significantly reduce the time that the Records Manager spends manually reviewing schedules for possible consolidation.	Natural Language Processing	Organization & Efficiency	Ongoing project (in production more than a year)	No impact		An AI powered dashboard provides recommendations for schedule consolidation and review, while also providing the Records Manager with the ability to review by cluster or by individual record.

DOJ	Privileged Material Identification	Scans documents and looks for attorney/client privileged information. It does this based on keyword input by the system operator.	Optical Character Recognition (or Text Extraction)	Monitoring or Detection	Ongoing project (in production less than six months)	No impact		

DOT	Technical Operations Predictive Maintenance	Utilize equipment telemetry data and statistical modeling to predict equipment failures before they occur in order to improve operational efficiency and safety by reducing unscheduled outages and/or shortening outage times as replacement equipment can be pre-positioned in anticipation of the failure.	Machine Learning (Type Unknown)	Organization & Efficiency	Planning or development stage	Indirect impact		

DOT	Surface Report Classifier (SCM/Auto-Class)	SCM classifies surface incident reports by event type, such as Runway Incursion, Runway Excursion, Taxiway Incursion/Excursion and categorizes runway incursions further by severity type.	Support Vector Machines	Classification or Labeling	Planning or development stage	No impact		

DOT	Regulatory Compliance Mapping Tool	RCMT processes all the documents’ paragraphs to extract the meaning (semantics) of the text. RCMT then employs a recommender system (also using some AI technology) to take the texts augmented by the texts’ meaning to establish candidate matches between the ICAO SARPs and FAA text that provides means of compliance.	Natural Language Processing	Organization & Efficiency	Ongoing project (in production less than a year)	No impact		The AVS International office is required to identify means of compliance to ICAO Standards and Recommended Practices (SARPs). Both SARPs and means of compliance evidence are text paragraphs scattered across thousands of pages of documents. AOV identified a need to find each SARP, evaluate the text of many FAA Orders, and suggest evidence of compliance based upon the evaluation of the text. The base dataset used by RCMT is the documents’ texts deconstructed into paragraphs.

DOT	Fusion and analysis of safety event reporting data	Integrates safety event reporting data from aircraft operators and manufacturers in support of data-driven decision making. The development of an ontology helped standardize and integrate data from across disconnected sources. 	Natural Language Processing	Organization & Efficiency	Planning or development stage	No impact		

DOT	JASC Code classification in Service Difficulty Reports (SDR)	Derives the joint aircraft system codes (JASC) chapter codes from the narrative description within service difficulty reports (SDR), a form of safety event reporting from aircraft operators.	Natural Language Processing	Classification or Labeling	Planning or development stage	No impact		

DOT	Safety risk classification	Uses features from Safety Management Tracking System (SMTS) hazards, projects, Safety Risk Management Documents (SRMDs), requirements, and SPT's to predict an initial risk level. 	Natural Language Processing	Forecasting & Prediction	Planning or development stage	No impact		
DOT	Feature importance modeling	Uses regression models to determine statistical correlations between AOV indicators and AJI's ASM and Surface Safety Metric (SSM) metrics.	Machine Learning (Type Unknown)	Research (Other)	Planning or development stage	No impact		

DOT	Anomaly Detection	Detects variances in Remote Monitoring and Logging System (RMLS) log outages to predict whether an outage should or should not be an SIE.	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		

DOT	Use Case to Identify Energy Signatures	Leveraging fusion data to correlate unstable approaches to energy “signatures” on approach.	Unclear	Research (Other)	Planning or development stage	No impact		

DOT	Offshore Precipitation Capability (OPC)	OPC leverages data from several sources such as weather radar, lightning networks, satellite and numerical models to produce a radar-like depiction of precipitation. The algorithm then applies machine learning techniques based on years of satellite and model data to improve the accuracy of the location and intensity of the precipitation areas.	Machine Learning (Type Unknown)	Mapping	Ongoing project (in production more than a year)	No impact		

DOT	Course Deviation Identification for Multiple Airport Route Separation (MARS)	May enable deconfliction of airports in high-demand metropolitan areas. To build necessary collision risk models for the safety case, several models are needed, including one that describes the behavior of aircraft that fail to navigate the procedure correctly. 	Machine Learning (Type Unknown)	Forecasting & Prediction	Planning or development stage	Indirect impact		

DOT	Determining Surface Winds with Machine Learning Software	Analyzes camera images of a wind sock to produce highly accurate surface wind speed and direction information in remote areas that don’t have a weather observing sensor.	Automated Image Processing	Mapping	Planning or development stage	No impact		

DOT	Remote Oceanic Meteorological Information Operations (ROMIO)	Evaluates the feasibility of uplinking convective weather information to aircraft operating over the ocean and remote regions. The capability converts weather satellite data, lightning data, and weather prediction model data into areas of thunderstorm activity and cloud top heights.	Neural Networks	Mapping	Planning or development stage	No impact		

SSA	Insight	Analyzes the free text of disability decisions and other case data to offer adjudicators real-time alerts on potential quality issues and case-specific reference information within a web application; offers adjudicators a series of interactive tools to help streamline their work.  	Natural Language Processing	Monitoring or Detection	Unknown	Indirect impact		

SSA	Intelligent Medical Language Analysis Generation (IMAGEN)	Analyzes clinical text from disability applicants’ health records and transforms it to data and other useful formats to enable disability adjudicators to more easily find and identify clinical content that is relevant to SSA’s disability determination process.	Natural Language Processing	Classification or Labeling	Unknown	Indirect impact		

SSA	Duplicate Identification Process (DIP)	Helps the user to identify and flag duplicate pages and documents within the disability electronic folder more efficiently, reducing the amount of task time associated with preparing cases for SSA's ALJ Hearings.	Automated Image Processing	Organization & Efficiency	Unknown	No impact		
SSA	Handwriting recognition from forms	Parses handwritten entries on specific standard forms submitted by clients.	Optical Character Recognition (or Text Extraction)	Organization & Efficiency	Unknown	Indirect impact		
GSA	Acquisition Analytics	Takes Detailed Data on transactions and classifies each transaction within the Government-wide Category Management Taxonomy.	Natural Language Processing	Classification or Labeling	Ongoing project (time unknown)	No impact		
GSA	Category Taxonomy Refinement	Uses token extraction from product descriptions to more accurately shape intended markets for PSCs.	Natural Language Processing	Classification or Labeling	Ongoing project (time unknown)	No impact		
GSA	Chatbot for Federal Acquisition Community	Streamlines the customer experience process, and automates providing answers to documented commonly asked questions through public facing knowledge articles.	Natural Language Processing	Chat Bot	Planning or development stage	Direct impact		
GSA	City Pairs Program Ticket Forecast and Scenario Analysis Tools	Takes segment-level City Pair Program air travel purchase data and creates near-term forecasts for the current and upcoming fiscal year by month and at various levels of granularity including DOD vs Civilian, Agency, and Region.	Unclear	Forecasting & Prediction	Planning or development stage	No impact		
GSA	Classifying Qualitative Data with Medallia	Users can create rules based on words and their relationships with other words to tag qualitative data with our topics (passports, tax refunds, etc.); also offers sentiment analysis.	Natural Language Processing	Classification or Labeling	Ongoing project (time unknown)	No impact		
GSA	Contract Acquisition Lifecycle Intelligence (CALI)	Streamlines the evaluation of vendor proposals.	Machine Learning (Type Unknown)	Project Management	Planning or development stage	No impact	Offered by Octo Consulting	
GSA	Enterprise Brain	Document repository that improves document discovery.	Unclear	Monitoring or Detection	Planning or development stage	No impact	Document repository from tanjo.ai	
GSA	Key KPI Forecasts for GWCM	Takes monthly historical data for underlying components used to calculate KPIs and creates near-term forecasts for the upcoming fiscal year.	Unclear	Forecasting & Prediction	Planning or development stage	No impact		
GSA	OAS Kudos Chatbot	Captures employee peer-to-peer recognition.	Natural Language Processing	Chat Bot	Planning or development stage	No impact		
GSA	ServiceNow Generic Ticket Classification	Takes generic ServiceNow tickets and classifies them so that they can be automatically re-routed to the correct team that handles these types of tickets.	Natural Language Processing	Classification or Labeling	Planning or development stage	No impact		
GSA	Solicitation Review Tool (SRT)	The SRT intakes SAM.gov data for all ICT solicitations. The system then compiles the data into a database to be used by machine learning algorithms. The first of these is a Natural Language Processing model that determines if a solicitation contains compliance language. If a solicitation does not have compliance language, then it is marked as non-compliant. 	Natural Language Processing	Classification or Labeling	Ongoing project (time unknown)	No impact		
GSA	Survey Comment Ham / Spam tester	Determines which comments in the USA.gov comments section are worth analysts' time to read.	Natural Language Processing	Classification or Labeling	Planning or development stage	Indirect impact		
DHS	Sentiment Analysis and Topic Modeling (SenTop)	Initially, analyzed survey responses for DHS’s Office of the Chief Procurement Officer related to contracting; currently, serves as a general-purpose text analytics solution that can be applied to any domain/area.	Natural Language Processing	Project Management	Ongoing project (in production more than a year)	No impact		
DHS	CISA - AIS Scoring & Feedback (AS&F)	AIS enables the real-time exchange of machine-readable cyber threat indicators and defensive measures to help protect against and ultimately reduce the prevalence of cyber incidents; AS&F specifically performs descriptive analytics from organizational-centric intelligence to support confidence and opinion classification of indicators of compromise.	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (in production less than six months)	No impact		
DHS	CISA - Automated PII Detection	Automatically detects potential PII within Automated Indicator Sharing submissions. If a submission is flagged for possible PII, it is queued for human review, where analysts are provided with the submission and artificial intelligence-assisted guidance on the specific PII concerns.	Natural Language Processing	Monitoring or Detection	Ongoing project (in production less than six months)	Indirect impact		
DHS	TSA - CDC Airport Hotspot Throughput (PageRank)	Determines the domestic airports that have the highest rank of connecting flights during the holiday travel season to help mitigate the spread of COVID-19.	Unclear	Mapping	Ongoing project (in production more than a year)	No impact		This capability is a DHS-developed artificial intelligence model written in Spark/Scala that takes historical non-PII travel data and computes the highest-ranking airports based on the PageRank algorithm.
DHS	USCG - Silicon Valley Innovation Program (SVIP) Language Translator	Supports the Coast Guard in facilitating real-time communications with non-English speakers and those who are unable to communicate verbally. The solicitation also included requirements for language translation technology to be capable of operating both online and offline because many Coast Guard interactions take place in extreme environmental conditions, and in locations without cell service or an internet connection.	Speech-to-Text	Organization & Efficiency	Ongoing project (in production more than a year)	No impact		
DHS	USCIS - Asylum Text Analytics (ATA)	Identifies plagiarism-based fraud in applications for asylum status and for the withholding of removal by scanning the digitized narrative sections of the associated forms and looking for common language patterns.	Machine Learning (Type Unknown)	Monitoring or Detection	Ongoing project (in production more than a year)	Indirect impact		
DHS	USCIS - BET/FBI Fingerprint Success Maximization	Enables technicians to receive immediate feedback when a set of prints is likely to be rejected by the FBI; aims to maximize the number of successful FBI submissions while minimizing the number of fingerprint recaptures necessary. 	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (in production less than a year)	Indirect impact		USCIS's Customer Profile Management Service (CPMS) serves as a person-centric repository of biometric and biographic information provided by applicants and petitioners (hereafter collectively referred to as “benefit requestors”) that have been issued a USCIS card evidencing the granting of an immigration related benefit (i.e., permanent residency, work authorization, or travel documents).

DHS	USCIS - Biometrics Enrollment Tool (BET) Fingerprint Quality Score	Takes a fingerprint image and assigns a score between 0 - 100, with 100 indicating that this is the best quality fingerprint image that could be obtained. The higher the score, the more likely that the fingerprint will match when captured again.	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (time unknown)	Indirect impact		

DHS	USCIS - Evidence Classifier	Systematically tags individual pages with some of the highest-volume, highest-impact evidence types to help case workers sort through certain immigration request forms.	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (in production more than a year)	No impact		

DHS	USCIS - FDNS-DS NexGen	Aids the Fraud Detection and National Security (FDNS) Directorate in investigative work, enhances investigative case prioritization, and detects duplicate case work.	Machine Learning (Type Unknown)	Research (Other)	Ongoing project (in production more than a year)	Indirect impact		

DHS	USCIS - Sentiment Analysis	USCIS issued a two-part survey asking users both quantitative and qualitative questions and then assigned "sentiments" to categories ranging from strongly positive to strongly negative.	Natural Language Processing	Monitoring or Detection	Ongoing project (in production more than a year)	Indirect impact		

DHS	USCIS - Testing Performance of ML Model using H2O	Determines the most used categories for applicants submitting I-90's, and uses machine learning to create predictions of workloads.	Machine Learning (Type Unknown)	Forecasting & Prediction	Ongoing project (in production more than a year)	Indirect impact		
DHS	USCIS - Timeseries Analysis and Forecasting	Used Autoregressive Integrated Moving Average (ARIMA) models on the I-90 form, which allowed the prediction of the total number of forms for a 2-year period.	Regression Analysis	Forecasting & Prediction	Ongoing project (in production more than a year)	No impact		
DHS	CBP - Agent Portable Surveillance	Identifies border activities of interest by analyzing data from Electro-Optical/Infra-Red cameras and radar.	Machine Learning (Type Unknown)	Security	Ongoing project (in production more than a year)	Indirect impact		
DHS	CBP - Autonomous Surveillance Towers	Scans constantly and autonomously: radar detects and recognizes movement, the camera slews autonomously to the item of interest, and the system software identifies the object; analyzes the camera and radar data, alerts the user, and autonomously tracks the item of interest.	Machine Learning (Type Unknown)	Monitoring or Detection	Ongoing project (in production more than a year)	Indirect impact		
DHS	CBP - I4 Viewer Matroid Image Analysis	Enables CBP end users to create and share vision detectors; Matroid detectors are trained computer vision models that recognize objects, people, and events in any image and in video streams.	Automated Image Processing	Monitoring or Detection	Ongoing project (time unknown)	Indirect impact		
DHS	CBP - Open-source News Aggregation	Enables users to make better decisions faster by identifying and forecasting emerging events on a global scale to mitigate risk, recognize threats, greatly enhance indications and warnings, and provide predictive intelligence capabilities	Network Analysis (ie Bayesian, or Social Network)	Forecasting & Prediction	Ongoing project (in production more than a year)	No impact		
DHS	ICE - Data Tagging and Classification	RAVEn leverages data tracking and classification to do the following: streamline how special agents and criminal analysts search, filter, translate, and report on electronic communications evidence and will help investigators more effectively determine the structure and organization of criminal enterprises; send and receive leads and enter outcomes such as arrests and seizures; improve the efficiency of agents and analysts in identifying pertinent evidence, relationships, and criminal networks from data extracted from mobile devices.	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (in production less than six months)	No impact		The Homeland Security Investigations (HSI) Innovation Lab is developing an analytical platform called the Repository for Analytics in a Virtualized Environment (RAVEn). RAVEn facilitates large, complex analytical projects to support ICE’s mission to enforce and investigate violations of U.S. criminal, civil, and administrative laws.
DHS	ICE - Language Translator	Used to increase the efficiency, accuracy, and quality of searching, analyzing, and translating speech.	Natural Language Processing	Classification or Labeling	Ongoing project (in production more than a year)	No impact		
DHS	ICE - RAVEn Compliance Automation Tool (CAT)	Increases the speed and efficiency of ingesting and processing Forms I-9 data.	Optical Character Recognition (or Text Extraction)	Organization & Efficiency	Ongoing project (in production less than six months)	Indirect impact		RAVEn CAT currently employs an Optical Character Recognition (OCR) model and software (Tesseract OCR) to identify pixel coordinates of handwritten characters and to read/extract computer-typed characters from ingested forms for processing.
DHS	ICE - RAVEn Normalization Services	Service to verify, validate, correct, and normalize the accuracy and quality of addresses, phone numbers, and names.	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (in production more than a year)	Indirect impact		The normalization services let agents analyze both well-defined addresses (such as those in CONUS and Europe) and less well-defined addresses (such as addresses using mile markers); standardize phone numbers to their identified country and to the E164 ITU standard; and streamline the process of correcting data entry errors and/or pointing out purposeful misidentification, connecting information about a person across HSI datasets, and cutting down the number of resource hours needed for investigations.
DOE	Advances in Nuclear Fuel Cycle Nonproliferation, Safeguards, and Security Using an Integrated Data Science Approach	Develops a digital twin of a centrifugal contactor system that receives data from traditional and real time sensors, constructs a digital representation or simulation of the chemical separations component within the nuclear fuel cycle, and performs data analysis through machine learning to determine anomalies, failures, and trends.	Machine Learning (Type Unknown)	Project Management	Planning or development stage	No impact		
DOE	Development of a multi-sensor data science system used for signature development on solvent extraction processes conducted within Beartooth facility	Utilizes non-traditional measurement sources such as vibration, acoustics, current, and light, and traditional sources such as flow and temperature, in conjunction with data-based machine learning techniques that will allow for signal discovery. The goal is to characterize stages within a solvent extraction process, which can increase target metals recovery, indicate process faults, account for special nuclear material, and inform near real-time decision making.	Machine Learning (Type Unknown)	Project Management	Planning or development stage	No impact		
DOE	Scalable Framework of Hybrid Modeling with Anticipatory Control Strategy for Autonomous Operation of Modular and Microreactors	Validates novel and scalable models to achieve faster-than-real-time prediction and decision-making capabilities; analyzes the risk of cascading failures when emerging reactors are deployed as part of a full feeder microgrid.	Machine Learning (Type Unknown)	Forecasting & Prediction	Unknown	No impact		
DOE	Accelerating and Improving the Reliability of Low Failure Probability Computations to Support the Efficient Safety Evaluation and Deployment of Advanced Reactor Technologies	Reduces the computational burden by reducing the number of finite element evaluations when estimating low failure probabilities.	Unclear	Organization & Efficiency	Planning or development stage	No impact		These will be implemented in the Multiphysics Object-Oriented Simulation Environment, which will help the nuclear engineering community to efficiently conduct probabilistic failure analyses and uncertainty quantification studies for the design and optimization of advanced reactor technologies.
DOE	Accelerating deployment of nuclear fuels through reduced-order thermophysical property models and machine learning	Develops a novel physics-based tool that combines 1) reduced-order models, 2) machine learning algorithms, 3) fuel performance methods, and 4) state-of-the-art thermal property characterization equipment and irradiated nuclear fuel data sets to accelerate nuclear fuel discovery, development, and deployment.	Machine Learning (Type Unknown)	Research (Other)	Planning or development stage	No impact		
DOE	Promoting Optimal Sparse Sensing and Sparse Learning for Nuclear Digital Twins	Addresses the efficient use of limited experimental data available for nuclear digital twin (NDT) training and demonstration. This involves developing sparse data reconstruction methods and using NDT models to define sensor requirements (location, number, accuracy) for the design of demonstration experiments.	Unclear	Research (Other)	Planning or development stage	No impact		
DOE	Artificial Intelligence Enhanced Advanced Post Irradiation Examination	Uncovers the relationships between micro/nanoscale structure, zirconium phase redistribution, local thermal conductivity, and engineering scale fuel properties; shows how artificial intelligence (AI)-based technology can facilitate and accelerate nuclear fuel development.	Neural Networks	Research (Other)	Unknown	No impact		
DOE	Secure Millimeter Wave Spectrum Sharing with Autonomous Beam Scheduling	Exploits the millimeter wave beam directionality and utilizes the beam sensing capabilities at end devices to prove that an autonomous radio frequency beam scheduler can support secure 5G spectrum sharing and guarantee optimality for base stations.	Unclear	Research (Other)	Unknown	No impact		
DOE	Objective-Driven Data Reduction for Scientific Workflows	Aims to develop theories and algorithms for objective-driven reduction of scientific data in workflows that are composed of various models, including data-driven AI models.	Unclear	Research (Other)	Planning or development stage	No impact		
DOE	The Grid Resilience and Intelligence Platform (GRIP)	Develops metrics that quantify the impact of anticipated weather-related extreme events. The platform uses utility data combined with physical models and a distribution power solver to infer the potential grid impacts of a major storm.	Unclear	Forecasting & Prediction	Unknown	No impact		
DOE	Open-Source High-Fidelity Aggregate Composite Load Models of Emerging Load Behaviors for Large-Scale Analysis (GMLC 0064)	Estimates the load composition data and motor protection profiles for different climate regions in the Western US; calibrates the parameters of the WECC composite load model to match the responses with the detailed feeder model.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Unknown	No impact		
DOE	Big Data Synchrophasor Monitoring and Analytics for Resiliency Tracking (BDSMART)	Explore the use of big data tools on phasor measurement unit data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	No impact		
DOE	Combinatorial Evaluation of Physical Feature Engineering and Deep Temporal Modeling for Synchrophasor Data at Scale	Explore the use of big data tools on phasor measurement unit data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	No impact		
DOE	MindSynchro	Explore the use of big data tools on phasor measurement unit data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	No impact		
DOE	PMU-Based Data Analytics Using Digital Twin Phasor Analytics Software	Explore the use of big data tools on phasor measurement unit data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	No impact		
DOE	A Robust Event Diagnostic Platform: Integrating Tensor Analytics and Machine Learning Into Real-time Grid Monitoring	Explore the use of big data tools on phasor measurement unit data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	No impact		
DOE	Discovery of Signatures, Anomalies, and Precursors in Synchrophasor Data with Matrix Profile and Deep Recurrent Neural Networks	Explore the use of big data tools on phasor measurement unit data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	No impact		
DOE	Machine Learning Guided Operational Intelligence	Explore the use of big data tools on phasor measurement unit data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	No impact		
DOE	Robust Learning of Dynamic Interactions for Enhancing Power System Resilience	Explore the use of big data tools on phasor measurement unit data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management.	Machine Learning (Type Unknown)	Monitoring or Detection	Unknown	No impact		
DOE	Artificial Intelligence Based Process Control and Optimization for Advanced Manufacturing	Develops the capability to intelligently control and optimize advanced manufacturing processes instead of the existing trial and error approach.	Neural Networks	Project Management	Unknown	No impact		Artificial intelligence-based control algorithms will be developed by employing deep reinforcement learning. To reduce the computational expense with advanced manufacturing models, physics-informed reduced order models (ROMs) will be developed. The AI-based control algorithms will employ the ROMs’ predictions to adaptively inform processing decisions in a simulation environment.
DOE	Smart Contingency Analysis Neural Network for in-depth Power Grid Vulnerability Analyses	Machine learning framework and resilience-chaos plots are leveraged to reduce computational expense required to discover, with 90% accuracy, n-2 contingencies by 50%.	Machine Learning (Type Unknown)	Organization & Efficiency	Unknown	No impact		
DOE	Resilient Attack Interceptor for Intelligent Devices	Focuses on developing external monitoring methods to protect industrial internet of things devices by correlating observable physical aspects that are produced naturally and involuntarily during the operational lifecycle with anomalous functionality.	Unclear	Monitoring or Detection	Ongoing project (time unknown)	No impact		
DOE	Infrastructure eXpression	Translates industrial control system features to a machine-readable format for use with automated cyber tools.	Unclear	Research (Other)	Short-term project or study	No impact		This project’s success can serve as the foundation for prioritizing the next research steps to realize automated threat response, improving the timeliness and fidelity of cyber incident consequence models, and enriching national capabilities to share actionable threat intelligence at machine speed.
DOE	Protocol Analytics to enable Forensics of Industrial Control Systems	Discovers methods and technologies to bridge gaps between the various industrial control systems (ICS) communication protocols and standard Ethernet to enable existing cybersecurity tools to defend ICS networks and empower cybersecurity analysts to detect compromise before threat actors can disrupt infrastructure, damage property, and inflict harm.	Machine Learning (Type Unknown)	Research (Other)	Unknown	Indirect impact		
DOE	Automated Type and Data Structure Resolution	Identified and labeled type and structure data in an automated and scalable way such that the information can be used in other tools and other Reverse Engineering at Scale research areas such as symbolic execution.	Machine Learning (Type Unknown)	Classification or Labeling	Unknown	No impact		
DOE	Signal Decomposition for Intrusion Detection in Reliability Assessment in Cyber Resilience	Provides a straightforward framework wherein an anomaly detection algorithm can be trained on existing expected data and then used for false data injection detection.	Machine Learning (Type Unknown)	Monitoring or Detection	Planning or development stage	No impact		An advanced library for signal decomposition and analysis will be developed that allows combining machine learning and artificial intelligence algorithms and high-fidelity model comparisons for greatly improved false data injection detection. This library will facilitate online and posteriori analysis of digital signals for the purpose of detecting potential malicious tampering in physical processes.
DOE	Advanced Machine Learning-based Fifth Generation Network Attack Detection System	Proves that enhancing attack detection via innovative machine learning techniques into the fifth generation (5G) cellular network can help to secure mission-critical applications, such as automated vehicles and drones, connected health, emergency response operations, and other mission-critical devices.	Machine Learning (Type Unknown)	Security	Planning or development stage	Indirect impact		
DOE	Red Teaming Artificial Intelligence	Provides methods for the reverse engineering, exploitation, risk assessment and vulnerability remediation. The insights gained from the explorations into vulnerability assessment research will proactively address critical gaps in the cybersecurity community’s understanding of these systems.	Machine Learning (Type Unknown)	Security	Planning or development stage	Indirect impact		
DOE	Unattended Operation through Digital Twin Innovations	Predicts events using the integrated data from test bed sensors and physics-based models; produces a framework for future digital twins.	Unclear	Forecasting & Prediction	Unknown	No impact		
DOE	Secure and Resilient Machine Learning System for Detecting Fifth Generation (5G) Attacks including Zero-Day Attacks	Implements an advanced machine learning based 5G attack detection system that can achieve high classification speed with high accuracy (90% or greater) as well as address a vulnerability to zero-day attacks using field programmable gate array based deep autoencoders.	Machine Learning (Type Unknown)	Classification or Labeling	Unknown	No impact		90% accuracy against real zero-day attacks recorded by Amazon Web Services.
DOE	Automated Malware Analysis Via Dynamic Sandboxes	Allows for automated analysis, provides non-existing core capabilities to analyze industrial control system malware, and outputs to a format that is machine readable and an industry standard in sharing threat information.	Machine Learning (Type Unknown)	Research (Other)	Unknown	No impact		
DOE	Interdependent Infrastructure Systems Resilience Analysis for Enhanced Microreactor Power Grid Penetration	Quantifies key resilience elements across integrated energy systems and their vulnerabilities to threats and hazards. This includes the ability to accurately analyze and visualize a region’s critical infrastructure systems’ ability to sustain impacts, maintain critical functionality, and recover from disruptive events.	Machine Learning (Type Unknown)	Mapping	Planning or development stage	Indirect impact		This advanced decision support capability can improve our understanding of these complex relationships and help predict the potential impacts that microreactors and distributed energy resources have on the reliability and resiliency of our energy systems.
DOE	Adaptive Fingerprinting of Control System Devices through Generative Adversarial Networks	Reduces manual labor and operational cost required for training an electromagnetic (EM)-based anomaly detection system for legacy industrial control systems devices and Industrial Internet of Things.	Unclear	Organization & Efficiency	Unknown	No impact		
DOE	Support Vector Analysis for Computational Risk Assessment, Decision-Making, and Vulnerability Discovery in Complex Systems	Combines a support vector machine and PRA software to auto-detect system design vulnerabilities and find previously unseen issues, reduce human error, and reduce human costs.	Support Vector Machines	Monitoring or Detection	Unknown	No impact		
DOE	Deep Reinforcement Learning and Decision Analytics for Integrated Energy Systems	Manages distributed or tightly coupled multi-agent systems utilizing deep neural networks for automatic system representation, modeling, and end-to-end learning.	Neural Networks	Mapping	Unknown	No impact		
DOE	Nuclear-Renewable-Storage Digital Twin: Enhancing Design, Dispatch, and Cyber Response of Integrated Energy Systems	Develops a learning-based and digital twin enabled modeling and simulation framework for economic and resilient real-time decision-making of physics-informed integrated energy systems (IES) operation.	Unclear	Forecasting & Prediction	Unknown	Indirect impact		Learning-based algorithms will make real-time decisions upon detection of component contingencies caused by climate-induced or man-made extreme events, such as cyber-attacks or extreme weather, thereby mitigating their impacts through appropriate counter measures.
DOE	Automated Infrastructure & Dependency Detection via Satellite Imagery and Dependency Profiles	Produces innovative and state-of-the-art image processing results that advance abilities to secure and defend national critical infrastructure.	Automated Image Processing	Security	Unknown	Indirect impact		
DOE	Accelerated Nuclear Materials and Fuel Qualification by Adopting a First to Failure Approach	Physics-based multi-scale modeling was coupled with deep, recursive, and transfer learning approaches to accelerate nuclear materials research and qualification of high-entropy alloys.	Neural Networks	Research (Other)	Unknown	No impact		
DOE	Evaluating thermal properties of advanced materials	Helps elucidate thermophysical properties of a material from a single laser flash measurement.	Machine Learning (Type Unknown)	Research (Other)	Unknown	No impact		The standard thermal diffusivity measurement technique laser flash is enhanced by modifying the traditional experimental set up and analyzing results with a machine learning based tool that includes a finite element model, a least-squares fitting algorithm and experimental data treatment algorithms. 
DOE	Spectral Observation Convolutional Neural Network	Analyzes collected radiation spectra using advanced, scalable deep learning by combining spectroscopic expertise with high performance computing.	Neural Networks	Research (Other)	Unknown	No impact		This method was trained, tested, and operated on the International Space Station’s Spaceborne Computer-2 supercomputer, returning zero errors over the course of 100 training hours.
DOE	Passive Strain Measurements for Experiments in Radiation Environments	Determines permanent strains induced by irradiation and extracts critical parameters using modeling and simulation as well as machine learning algorithms.	Machine Learning (Type Unknown)	Research (Other)	Short-term project or study	No impact		
DOE	Machine Learning Interatomic Potentials for Radiation Damage and Physical Properties in Model Fluorite Systems	Studies the influence of radiation damage on physical properties of calcium fluoride and uranium dioxide.	Machine Learning (Type Unknown)	Research (Other)	Short-term project or study	No impact		The high throughput capability of this method will become an important combinatorial materials science tool for developing and qualifying new nuclear fuels.
DOE	Data-driven failure diagnosis and prognosis of solid-state ceramic membrane reactor under harsh conditions using deep learning technology with internal voltage sensors	Investigates in situ the effects of different components on the degradation behavior in a solid-state ceramic membrane reactor by embedding sensors that will collect current and impedance data during operation. 	Unclear	Research (Other)	Short-term project or study	No impact		
DOE	Tailoring the Properties of Multiphase Materials Through the Use of Correlative Microscopy and Machine Learning	Identifies and correlates the critical microstructural features in a multiphase alloy that exhibits high strength and fracture toughness.	Neural Networks	Research (Other)	Unknown	No impact		Experimental data will be used to train a convolutional neural network (CNN) in a semi-supervised environment to identify key microstructural features and correlate those features with the strength and toughness
DOE	Microstructurally-driven Framework for Optimization of In-core Materials	Enables reactor developers to quickly understand the complex linkage between alloy composition, thermomechanical processing, the resulting microstructure, and swelling and creep behavior.	Automated Image Processing	Research (Other)	Unknown	No impact		
DOI	Wildlife Underpass Camera Trap Image Classification, San Diego CA	This software system takes wildlife camera trap images as inputs and outputs the probability of the image belonging to user-specified taxonomic classes based on wildlife species present in each image. 	Neural Networks	Classification or Labeling	Planning or development stage	No impact		The process of humans reviewing, labeling, and QA/QCing labels is labor intensive, time consuming, and costly. Developing AI systems that can perform these tasks within an acceptable level of accuracy can reduce the costs in extracting tabular data from camera-based datasets and increase the volume of data for analysis.
DOI	Walrus Haulout Camera Trap Image Classification	Takes walrus haulout camera trap images as inputs and outputs the probability of the image containing walruses and various human disturbances (boats, aircraft, etc.). 	Neural Networks	Classification or Labeling	Planning or development stage	No impact		
DOI	ARMI Amphibian Species ID from Acoustic Data	Takes audio clips that have been converted to sonograms (images) and classifies the species generating the vocalizations in the recordings; initial prototype project will attempt to develop models that can identify audio clips containing bullfrog vocalizations.	Neural Networks	Classification or Labeling	Planning or development stage	No impact		Reviewing audio recordings and identifying species vocalizations captured therein is time consuming and labor intensive. For these reasons, many recordings remain unprocessed, preventing valuable data from being available for analysis. 
DOI	Individual Mountain Lion ID from Camera Data	Takes pairs of mountain lion facial images and outputs the probability that the images come from the same individual mountain lion. This will allow researchers to passively "mark" individuals and support population estimation analyses. 	Neural Networks	Monitoring or Detection	Planning or development stage	No impact		
DOI	Walrus Object Detection in Drone/Satellite Imagery	Inputs drone imagery and outputs bounding boxes for individual walruses.	Neural Networks	Monitoring or Detection	Planning or development stage	No impact		If successful, this will allow researchers with Alaska Science Center to count the numbers of walruses in drone imagery to support population research.
DOI	PRObability of Streamflow PERmanence	Incorporates sparse streamflow observation data representing wet or dry stream conditions and gridded hydroclimatic explanatory data to predict the annual probability of streamflow permanence at 30-m (PROSPER Pacific Northwest) or 10-m (PROSPER Upper Missouri) resolution. 	High-Power Computing	Forecasting & Prediction	Ongoing project (in production more than a year)	No impact		
DOI	Water Mission Area Drought Prediction Project	Predicts daily hydrologic drought using machine learning models calibrated on streamflow data (response) and meteorological forcing data. 	High-Power Computing	Forecasting & Prediction	Planning or development stage	No impact		
DOI	Water Mission Area Regional Drought Early Warning System	Predicts and forecasts daily hydrologic drought in the Colorado River Basin (CRB).	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Planning or development stage	No impact		Uses gridded meteorologic forcing data and daily streamflow data in the CRB to build random forest and neural networks (long-short term memory) to determine the best approach to predicting and forecasting hydrologic drought. The project is being developed on AWS and in cooperation with CHS. We are also using the USGS HPC systems.
DOI	AI system to recognize individual fish and disease	Recognizes individual fish and their disease status from images. Success of this effort could complement or replace traditional mark-recapture methods used for estimating abundance, survival, and movement, and this could greatly reduce costs to fisheries managers. 	Automated Image Processing	Monitoring or Detection	Ongoing project (in production less than a year)	No impact		
DOI	River Image SEnsing	Development of a reliable camera system for integration into the operational streamgage monitoring network of the USGS Water Mission Area; capable of producing time-series of surface water levels derived from still camera images using AI/ML modeling techniques.	Automated Image Processing	Monitoring or Detection	Ongoing project (in production less than a year)	No impact		
DOI	Estimating stream flow from images in headwaters	Measures how much water flows in small, ungaged stream networks using timelapse images captured by inexpensive and off-the-shelf cameras and provides a web-based platform for making the images, associated climate and other related data as well as the model itself easy to access and explore. 	Automated Image Processing	Monitoring or Detection	Ongoing project (in production less than a year)	No impact		
DOI	Economic valuation of fisheries in the Delaware River	Links existing hydrological flow data and models with trout population dynamic models, changes to fish catch, and the economic benefits of recreational fishing. 	Neural Networks	Monitoring or Detection	Ongoing project (in production less than six months)	No impact		
DOI	Stream physical habitat characterization in the Chesapeake Bay Watershed	Takes a large dataset of rapid habitat assessment data collected by multiple jurisdictions in the Chesapeake Bay Watershed, trains a predictive model using those data, and uses that model to predict stream habitat conditions for all unmeasured stream reaches in the region. 	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Ongoing project (in production less than a year)	No impact		
DOI	Deep Learning for Automated Detection and Classification of Waterfowl, Seabirds, and other Wildlife from Digital Aerial Imagery	Automates the detection of wildlife in aerial imagery and the taxonomic classification of wildlife from the binary detector. 	Automated Image Processing	Monitoring or Detection	Ongoing project (in production less than a year)	No impact		Uses Tallgrass to develop and train algorithms, BlackPearl/Caldera to store large image datasets, a hosted instance of a customized version of the Computer Vision Annotation Tool to gather manually annotated data, and a separate PostgreSQL database to store annotations and image metadata.
DOI	Prediction of Regolith Thickness in the Delaware River Basin	Uses observations of the depth to bedrock reported by private well drillers in the Delaware River Basin to map the thickness of the regolith layer. 	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Ongoing project (in production less than a year)	No impact		
DOI	ML-Mondays course on applications of deep learning to image analysis	A course in application of deep learning image segmentation, image classification, and object-in-image detection. 	Machine Learning (Type Unknown)	Research (Other)	Ongoing project (in production more than a year)	No impact		
DOI	Coast Train	Dataset of orthomosaic and satellite images of coastal, estuarine, and wetland environments and corresponding thematic label masks. 	Automated Image Processing	Classification or Labeling	Ongoing project (in production less than six months)	No impact		The data consist of spatial and time-series imagery and contain 1.2 billion labelled pixels, representing over 3.6 million hectares.
DOI	Seabird and Marine Mammal Surveys Near Potential Renewable Energy Sites Offshore Central and Southern California	Images output from the final model classified targets into seven categories: bird, dark bird, dark bird flying, light bird, fish, marine mammal, and other.  Next, reclassify model labels to the lowest taxonomic group possible. 	Automated Image Processing	Classification or Labeling	Short-term project or study	No impact		The Seabird Studies Team at the Western Ecological Research Center (WERC), with support from the Bureau of Ocean Energy Management (BOEM), completed aerial photographic surveys of the ocean off central and southern California between 2018-2021. Over 800,000 high resolution images of the ocean were collected, with the goal of extracting and counting marine birds and mammals contained within. Once low taxonomic reclassification is complete, we will generate maps of species distribution and abundance to inform BOEM’s planning in advance of potential offshore wind energy development along the California coast.
DOI	Fouling Identification Neural Network (FINN)	Predicts and detects sensor (sonde) fouling at USGS stream gages. 	Neural Networks	Monitoring or Detection	Ongoing project (in production more than a year)	No impact		
DOI	Mapping river bathymetry from remotely sensed data	Uses high frequency satellite images from the Planetscope constellation to estimate water depth in river channels.	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		The training data consist of field measurements of water depth collected as part of other USGS projects on five different rivers. The neural network regression method is implemented in MATLAB using the Deep Learning Toolbox.
DOI	Mapping benthic algae along the Buffalo National River from remotely sensed data	Uses orthophotos acquired from a manned, fixed-wing aircraft and multispectral images from two different satellites to map bottom-attached (benthic) algae along the Buffalo National River in northern Arkansas. 	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Planning or development stage	No impact		
DOI	Characterization of Sub-surface drainage (tile drains) from satellite imagery	Delineates tile drains in satellite imagery, providing a way to look at historical imagery and to use satellite data to maintain an up-to-date geospatial layer of tile drain extent in basins of interest.	Automated Image Processing	Mapping	Ongoing project (time unknown)	No impact		Uses panchromatic imagery that is processed using a UNet model that was trained on a library of panchromatic images on which visible tile-drain networks had been traced; uses a combination of python scripting that is encapsulated in a Jupyter notebook.
DOI	Waterfowl Lifehistory and Behavior Classification	Provides a highly accurate daily classification of waterfowl behavior into 8 life history states/movement patterns using hourly GPS relocations and, optionally, remotely sensed habitat data. 	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Classification or Labeling	Planning or development stage	No impact		
DOI	Spot Elevation OCR from historical topo maps	Creates a database of summit spot elevations from the HTMC labeled for summits in CONUS.	Optical Character Recognition (or Text Extraction)	Classification or Labeling	Ongoing project (in production more than a year)	No impact		
DOI	TerrainFeatures detection and recognition	Uses DL tools to extract terrain features.	Neural Networks	Monitoring or Detection	Ongoing project (in production more than a year)	No impact		
DOI	The National Landcover database	Develops Landcover across all 50 states; includes HPC processes, cloud services, and local resources to create thematic and continuous field classifications. 	Machine Learning (Type Unknown)	Classification or Labeling	Ongoing project (time unknown)	No impact		These classifications serve as the base for users and federal agencies across the nation to provide wildlife habitat estimates, urban runoff estimates, population growth, etc. 
DOI	Artificial Intelligence for Environment & Sustainability (ARIES)	ARIES is a full-stack solution for integrated modelling, supporting the production, curation, linking and deployment of scientific artifacts such as datasets, data services, modular model components and distributed computational services. This design enables automation of a wide range of modeling tasks that would normally require human experts to perform.	Network Analysis (ie Bayesian, or Social Network)	Mapping	Ongoing project (in production more than a year)	No impact		ARIES is an international research project based at the Basque Centre for Climate Change (Bilbao, Spain), to which USGS has been a long-term collaborator. ARIES uses semantics and machine reasoning to enable AI-assisted multidisciplinary, integrated modeling of coupled human-natural systems.
DOI	Global Inland Fisheries Risk Index	Informs the relative influence of threats in the development of a global inland fisheries assessment using boosted regression trees to derive a spatially-explicit risk index of stressors.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Ongoing project (in production more than a year)	No impact		
DOI	Fish and Climate Change Database (FiCli)	Automates certain portions of the review process to increase efficiency in maintaining and updating the database.	Natural Language Processing	Organization & Efficiency	Ongoing project (in production less than a year)	No impact		The Fish and Climate Change Database (FiCli) is a comprehensive database of peer-reviewed literature compiled through an extensive, systematic primary literature review to identify English-language, peer-reviewed journal publications with projected and documented examples of climate change impacts on inland fishes globally.
DOI	Evaluating fish movement in restored coastal wetlands using imaging sonar and machine learning models	Wetland managers are restoring coastal wetland habitats in the Great Lakes, and often seek more information on when and how fish access restored habitats. Terabytes of hydroacoustic data on fish movement need to be analyzed more efficiently, so a collaboration between USGS, USFWS, and the University of Michigan is developing a machine learning model (MLM) that identifies, tracks, and quantifies fish movement. 	Neural Networks	Monitoring or Detection	Ongoing project (in production less than a year)	No impact		The completed model will read proprietary sonar image files, convert them to a universal file format (i.e., .mp4), place bounding boxes around individual fish detected by the model, and track them across consecutive image frames to determine bi-directional movement. The model uses training data and TensorFlow-based convolutional neural networks for object detection.
DOI	Fluvial Fish Native Distributions for the Conterminous United States using the NHDPlusV2.1 and Boosted Regression Tree (BRT) Models	Develops species distribution models for 271 fluvial fish species in their native ranges of the conterminous United States. 	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Ongoing project (in production less than six months)	No impact		
DOI	Prediction of Inland Salinity in the Delaware River Basin	Inputs watershed characteristics (soils, land cover), land use (road salt application) and meteorological timeseries, and outputs predictions of specific conductance (SC) for inland stream reaches in the Delaware River Basin (DRB). 	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		The model will be trained using SC sample data from within the DRB. The resulting model will allow for predictions in ungaged locations and time periods, and allow for an evaluation of salinity exposure in these stream reaches. The model will be built using pyTorch on the USGS Tallgrass supercomputer.
DOI	Prediction of Salt Front Location in the Delaware River Estuary	Makes predictions of the 250 mg/L isochlor (salt front location) within the Delaware River Estuary. The model will be driven by river discharge into the estuary, tidal forcings, and meteorological data from several points throughout the estuary. 	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		Model predictions will be compared with a process-based, hydrodynamic model, COAWST.
DOI	Prediction of Water Temperature in the Delaware River Basin	Makes water temperature predictions at 456 reaches in the Delaware River Basin. 	Neural Networks	Forecasting & Prediction	Ongoing project (in production more than a year)	No impact		The recurrent graph convolutional network (RGCN) was pre-trained with predictions from a coupled process-based model that predicts stream flow and temperature.
DOI	Forecasting Water Temperature in the Delaware River Basin	Produces 7-day forecasts of daily maximum stream water temperature downstream of drinking water reservoirs in support of water management decisions.  	Neural Networks	Forecasting & Prediction	Ongoing project (in production less than a year)	Indirect impact		Our process-guided deep learning model was pretrained on output from an integrated stream-reservoir process-based model and used an autoregressive technique and data assimilation to ingest real-time observations of stream temperature to improve near-term forecasts.
DOI	Prediction of Flood Flow Metrics for Minimally Altered Catchments	Inputs watershed characteristics (soils, land cover) and long-term meteorological data, and outputs predictions of flood flow metrics (magnitude, duration, frequency, volume) for stream reaches. 	Machine Learning (Type Unknown)	Forecasting & Prediction	Planning or development stage	No impact		The resulting models will allow for estimating flood flow metrics in ungaged reaches, which can be used to inform infrastructure designs along those reaches (e.g., bridges).
DOI	Process-Guided Deep Learning Predictions of Lake Water Temperature	Predicts depth-specific lake temperatures while obeying physical laws using inputs of meteorological drivers. 	Neural Networks	Forecasting & Prediction	Ongoing project (in production more than a year)	No impact		
DOI	Prediction of Lake Water Temperature using Lake Attributes	Inputs lake characteristics (surface area, elevation, and others to be determined) and outputs predictions of depth-specific lake temperatures. 	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		The models will be developed using various Python packages including PyTorch on the USGS Tallgrass supercomputer.
DOI	Process-Guided Deep Learning for Dissolved Oxygen Predictions on Stream Networks	Predicts daily minimum, mean, and maximum dissolved oxygen (DO) concentrations at several stream locations in the Delaware River Basin. 	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		The deep learning models were written via TensorFlow, the data preparation is in R, and the modeling workflow was scripted via Snakemake.
DOI	Multi-task deep learning for daily streamflow and water temperature	Predicts two interdependent variables, daily average streamflow and daily average stream water temperature, together using multi-task deep learning. 	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		The stream temperature data were collected by the USGS and made available via NWIS. The streamflow observations were also collected by the USGS but collated along with input drivers in the CAMELS dataset. This work was done using Python. The deep learning models were written via TensorFlow and the modeling workflow was scripted via Snakemake.
DOI	Predicting Water Temperature Dynamics of Unmonitored Lakes With Meta‐Transfer Learning	Compares the transfer of different model types from well-observed to unobserved lake systems. 	Neural Networks	Forecasting & Prediction	Ongoing project (in production more than a year)	No impact		Process-based models, neural networks, and process-guided neural networks are trained on well-observed lakes (source lakes) and then used to make predictions in unobserved lakes (target lakes).
DOI	Process-guided deep learning for predicting stream temperature in out-of-bound conditions	Predicts network wide daily average stream temperature in the Delaware River Basin; compares the performance of two deep learning architectures, both of which incorporate process guidance through pretraining on process-based modelling outputs.	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		
DOI	Process guidance for learning groundwater influence on stream temperature predictions	Predicts network wide daily average stream temperature in the Delaware River Basin; focuses on developing a custom loss function that helps deep learning models learn to account for groundwater influence on stream temperature. 	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		
DOI	Explainable AI and interpretable machine learning	Develops expertise and resources for Explainable AI (XAI) within WMA PUMP Projects. The inputs are various models developed for predicting stream temperature, discharge, dissolved oxygen, and other characteristics. The outputs are interpretable metrics to help understand why models are making the predictions they are and what physical processes are getting captured with the model architectures.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Research (Other)	Planning or development stage	No impact		
DOI	AI applications to mapping surface water	Investigates the use of hand annotated hydrography from one region to train an artificial neural net (ANN) to identify where surface water is likely to be in other areas.	Neural Networks	Monitoring or Detection	Planning or development stage	No impact		
DOI	Where’s the Rock: Using Neural Networks to Improve Land Cover Classification	Differentiates exposed bare rock (rock) from soil cover (other) in order to classify bare rock in NAIP orthoimagery, starting with the Sierras, in order to provide a more accurate map of soil vs. rock-covered areas for use in landslide hazard mapping, quantifying soil carbon storage, calculating water fluxes, etc.	Neural Networks	Classification or Labeling	Ongoing project (in production more than a year)	No impact		
DOI	Data–driven prospectivity modelling of sediment–hosted Zn–Pb mineral systems and their critical raw materials	Produces a prospectivity map for Clastic Dominated and Mississippi Valley Type deposits in the three countries.	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Ongoing project (in production less than a year)	No impact		
DOI	Updating Real-time Earthquake Shaking, Ground Failure, and Impact products with remote sensing and ground truth observations	Enables accurate and high-resolution multi-hazard and damage estimates by jointly inferring shaking and secondary hazards and resulting building damage and quantifying their causal dependencies from imagery and prior loss and GF models. 	Network Analysis (ie Bayesian, or Social Network)	Forecasting & Prediction	Planning or development stage	No impact		The underlying physical causal dependencies are modeled using a multi-layer causal Bayesian network. Initial results are impressive, showing that our framework significantly improves the GF prediction abilities.
DOI	Using Artificial Neural Networks to Improve Earthquake Ground-Motion Models	Provides estimates of peak ground-motion from earthquakes given the location, magnitude, and local geological structure at a site of interest.	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		
DOI	Leveraging Deep Learning to Improve Earthquake Monitoring	Characterizes earthquake source information using small portions of waveform data to improve automatic phase picking, classify phase types, and estimate source-station distances.	Neural Networks	Monitoring or Detection	Ongoing project (in production less than a year)	No impact		
DOI	Using Gradient Boosting Method and Feature Selection to Reduce Aleatory Uncertainty of Earthquake Ground-Motion Models	Develops ground-motion models for peak ground acceleration and peak ground velocity using a gradient boosting method (GBM).	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Mapping	Planning or development stage	Indirect impact		In total 128 GBM-based ground-motion models are developed for estimating PGA and PGV, respectively, using varying subsets of explanatory variables.
DOI	Application of machine learning to ground motion-based earthquake early warning	Predicts what the earthquake peak ground shaking will be across a region. 	Machine Learning (Type Unknown)	Forecasting & Prediction	Planning or development stage	Indirect impact		
DOI	A machine learning approach to developing ground motion models from simulated ground motions	Builds a ground motion model (GMM) from a synthetic database of ground motions extracted from the Southern California CyberShake study. An artificial neural network is used to find the optimal weights that best fit the target data (without overfitting), with input parameters chosen to match those of state-of-the-art GMMs. 	Neural Networks	Forecasting & Prediction	Ongoing project (in production more than a year)	No impact		
DOI	Integrating machine learning phase pickers into the Southern California Seismic Network earthquake catalog	Evaluates the readiness of machine-learning models for automatic earthquake detection and phase picking to enhance the Southern California Seismic Network earthquake catalog, with the end-goal of using these models in routine seismic network operations. 	Neural Networks	Monitoring or Detection	Planning or development stage	No impact		
DOI	Understanding the 2020-2021 Puerto Rico Earthquake sequence with deep learning approaches	Enhances the earthquake catalog for the 2020-2021 southwestern Puerto Rico earthquake sequence with a variety of deep learning approaches to understand its complex fault system, triggering mechanisms, and long-lived vigorous nature of the aftershock sequence. 	Neural Networks	Research (Other)	Short-term project or study	No impact		
DOI	Land Use Plan Document and Data Mining and Analysis R&D	Explores the potential to identify patterns, rule alignment or conflicts, discovery, and mapping of geo history and/or rules. Inputs included unstructured planning documents. Outputs identify conflicts in resource management planning rules with proposed action locations requiring exclusion, restrictions, or stipulations as defined in the planning documents.	Natural Language Processing	Research (Other)	Planning or development stage	No impact		
DOI	Data Driven Sub-Seasonal Forecasting of Temperature and Precipitation	Deployed data driven methods for sub-seasonal (2-6 weeks into future) prediction of temperature and precipitation across the western US. 	Decision Tree Analysis (ie Random Forest or Gradient-Boosting)	Forecasting & Prediction	Planning or development stage	No impact		Improving sub-seasonal forecasts has significant potential to enhance water management outcomes.
DOI	Data Driven Streamflow Forecasting	A year-long evaluation of existing 10-day streamflow forecasting technologies and a companion prize competition open to the public, also focused on 10-day streamflow forecasts. Forecasts were issued every day for a year and verified against observed flows.	Neural Networks	Forecasting & Prediction	Planning or development stage	No impact		Across locations and metrics, the top-performing forecast product came from a private AI/ML forecasting company, UpstreamTech. Several competitors from the prize competition also performed strongly, outperforming benchmark forecasts from NOAA. Reclamation is working to further evaluate the UpstreamTech forecast products and also the top performers from the prize competition.
DOI	Seasonal/Temporary Wetland/Floodplain Delineation using Remote Sensing and Deep Learning	Provides improved seasonal/temporary wetland/floodplain delineation when high temporal and spatial resolution remote sensing data is available to inform the management of protected species and provide critical information to decision-makers during scenario analysis for operations and planning.	Neural Networks	Mapping	Short-term project or study	No impact		
DOI	Improving UAS-derived photogrammetric data and analysis accuracy and confidence for high-resolution data sets using artificial intelligence and machine learning	UAS derived photogrammetric products contain a large amount of potential information that can be less accurate than required for analysis and time consuming to analyze manually; apply machine learning to better analyze photogrammetric products. 	Machine Learning (Type Unknown)	Research (Other)	Planning or development stage	No impact		
DOI	Photogrammetric Data Set Crack Mapping Technology Search 	Explores a specific application of photogrammetric products to process analysis of crack mapping on Reclamation facilities. 	Unclear	Mapping	Planning or development stage	No impact		
DOI	Improved Processing and Analysis of Test and Operating Data from Rotating Machines	This project is exploring a better method to analyze DC ramp test data from rotating machines. Previous DC ramp test analysis requires engineering expertise to recognize characteristic curves from DC ramp test plots. The ramp test plots can be analyzed by computer software, rather than manual engineering analysis, to recognize characteristic curves. 	Regression Analysis	Research (Other)	Planning or development stage	No impact		
DOI	Sustained Casing Pressure Identification	Quickly identify wells with sustained casing pressures to mitigate accidents on well platforms. 	Neural Networks	Monitoring or Detection	Planning or development stage	No impact		
DOI	Level 1 Report Corrosion Level Classification	Automated screening system that can identify parts of wells that exhibit excess corrosion to greatly reduce report processing time.	Neural Networks	Organization & Efficiency	Planning or development stage	No impact		
DOI	Well Activity Report Classification	Researches the use of self-supervised deep neural networks to identify classification systems for significant well events using data from Well Activity Reports.	Neural Networks	Research (Other)	Planning or development stage	No impact