The Capstone Experience provides the educational capstone for all students majoring in computer science at Michigan State University. Teams of students build software projects for corporate clients. For information on becoming a project sponsor, see Project Sponsorship or contact Dr. Wayne Dyksen. The following were the project sponsors and projects for Spring 2020:
Amazon: Amazon Data Hub
Headquartered in Seattle, Amazon is the world’s largest online retailer and is also the world’s largest cloud services provider with their Amazon Web Services (AWS) products.
As a leader in the technology sector, Amazon has access to massive amounts of data. They employ teams of data scientists to analyze this data to improve Amazon’s various offerings, including their product recommendations.
The task of finding the best dataset for a problem is time-consuming and requires significant manual work, including looking through thousands of individual files that are stored in many different locations. This process takes up a substantial amount of time that could be better used for development.
Our Amazon Data Hub software streamlines dataset acquisition with an easy-to-use website that allows data scientists to automatically search through Amazon’s collection of data.
When an Amazon data scientist uploads a dataset to our Amazon Data Hub repository, it undergoes automated analysis. This includes object detection and speech recognition for images, videos and audio, as well as statistical analysis of numerical data.
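As a rough illustration of the statistical analysis step for numerical data, the sketch below computes a few descriptive statistics for one numeric column using only the Python standard library. The function name `summarize_column` and the choice of statistics are assumptions made for this example; the source does not say which statistics the Amazon Data Hub actually computes.

```python
import statistics


def summarize_column(values):
    """Return basic descriptive statistics for one numeric column.

    Hypothetical helper: the statistics chosen here are illustrative only.
    """
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


summary = summarize_column([2.0, 4.0, 4.0, 6.0])
print(summary)
```

A summary like this could be stored alongside the uploaded dataset so that search results can show it without rereading the raw files.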
Data scientists use the web application to search through our catalog of datasets. Search results include information provided when the dataset was uploaded, as well as information from our automated analysis. Intuitive visualizations of each dataset allow users to quickly evaluate the relevance of each dataset.
The Amazon Data Hub decreases the time it takes to find suitable datasets from hours to minutes, allowing data scientists to spend their time on more important work.
Our system uses AWS’s scalable products, including S3, DynamoDB, Rekognition, Transcribe, Lambda, Elastic MapReduce, and Elasticsearch, to store, process and search the datasets. Python Flask is used to connect our back end with our ReactJS front end.
AppDynamics: Segmented Data Anomaly Detection
AppDynamics, headquartered in San Francisco, provides a leading application performance management (APM) platform, which is used by corporations around the world to monitor the performance of their software systems.
Application owners and developers use the BizIQ feature of the APM to quickly correlate business consequences with application performance.
For example, imagine that users with Acme credit cards and hyphenated surnames are experiencing lengthy response times while making purchases on an e-commerce store. Lower customer satisfaction rates ensue, leading to quantifiable revenue risk.
BizIQ monitors this software issue, investigates the root causes of the performance bottlenecks, and delivers actionable insights. However, BizIQ is currently unable to automatically recognize unique combinations of factors, such as Acme users with hyphenated surnames, that are causing issues.
Segmented Data Anomaly Detection utilizes the copious amounts of customer data collected by the APM to improve the diagnostic aspect of BizIQ with machine learning.
Leveraging cluster analysis and unsupervised machine learning, anomalies are explored across hundreds of performance metrics. This leads to the discovery of specific combinations of factors that cause performance issues.
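To make the idea of "specific combinations of factors" concrete, here is a minimal sketch that swaps the project's cluster analysis for a much simpler technique: it groups records by every combination of factors and flags groups whose mean metric is far above the overall mean. The function name `slow_segments`, the field names, the `ratio` threshold and the sample data are all assumptions for illustration, not the project's method.

```python
from collections import defaultdict
from itertools import combinations
import statistics


def slow_segments(records, factors, metric, ratio=2.0, min_size=2):
    """Flag factor combinations whose mean metric is >= ratio * overall mean."""
    overall = statistics.fmean(r[metric] for r in records)
    flagged = []
    # Try single factors first, then pairs, and so on.
    for size in range(1, len(factors) + 1):
        for combo in combinations(factors, size):
            groups = defaultdict(list)
            for r in records:
                key = tuple((f, r[f]) for f in combo)
                groups[key].append(r[metric])
            for key, values in groups.items():
                mean = statistics.fmean(values)
                if len(values) >= min_size and mean >= ratio * overall:
                    flagged.append((dict(key), mean))
    return flagged


# Made-up response times in milliseconds.
records = [
    {"card": "Acme", "hyphenated": True, "response_ms": 900},
    {"card": "Acme", "hyphenated": True, "response_ms": 1100},
    {"card": "Acme", "hyphenated": False, "response_ms": 100},
    {"card": "Acme", "hyphenated": False, "response_ms": 120},
    {"card": "Other", "hyphenated": True, "response_ms": 110},
    {"card": "Other", "hyphenated": True, "response_ms": 90},
    {"card": "Other", "hyphenated": False, "response_ms": 100},
    {"card": "Other", "hyphenated": False, "response_ms": 80},
]
flagged = slow_segments(records, ["card", "hyphenated"], "response_ms")
print(flagged)
```

In this sample neither factor is flagged on its own; only the combination of an Acme card and a hyphenated surname crosses the threshold, which is the kind of segment a per-factor view would miss.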
Automating this diagnosis in parallel with data collection saves time and determines the root cause of an issue more accurately.
Segmented Data Anomaly Detection uses Node.js to pull data from the APM, and scikit-learn running on Python to perform data analysis. The results of the analysis are rendered in a web app, which is built with JavaScript and includes cluster visualizations powered by D3.js.
Auto-Owners Insurance: Phish Phinder
Headquartered in Lansing, Michigan, Auto-Owners Insurance is a Fortune 500 company that is represented by over 47,000 licensed insurance agents across 26 states. Auto-Owners provides automotive, home, life and business insurance to nearly 3 million policyholders.
Every day, associates at companies like Auto-Owners receive phishing emails that attempt to obtain sensitive personal and company information. Educational awareness programs, while common, do not protect a company against all phishing attempts and can make employees extremely cautious. As a result, cyber security departments are flooded with emails forwarded to them by concerned associates.
Our Phish Phinder is an Outlook add-in which automates the phishing detection process for wary professionals. When a user sees a suspicious email and clicks the add-in button, our software scans the email and returns a categorization and confidence score. In an Outlook sidebar, the email is categorized as a confirmed phishing attempt, a suspected phishing attempt, spam or harmless.
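The mapping from a scan result to one of the four sidebar categories might look roughly like the sketch below. The two input scores, the threshold values and the function name `categorize` are assumptions for illustration; the source does not describe how Phish Phinder derives its categorization or confidence score.

```python
def categorize(phishing_score, spam_score, confirmed=0.9, suspected=0.5):
    """Map hypothetical scores in [0, 1] to a (category, confidence) pair."""
    if phishing_score >= confirmed:
        return "confirmed phishing attempt", phishing_score
    if phishing_score >= suspected:
        return "suspected phishing attempt", phishing_score
    if spam_score >= suspected:
        return "spam", spam_score
    # Neither score is high, so confidence in "harmless" is the complement.
    return "harmless", 1.0 - max(phishing_score, spam_score)


print(categorize(0.95, 0.10))
print(categorize(0.10, 0.80))
```

Checking the phishing score before the spam score means an email that looks like both is reported under the more serious category.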
The user is also given an educational tutorial detailing and explaining the suspicious parts of the email. This gives associates a method to better understand the characteristics of spam and phishing attempts.
The email data gathered by Phish Phinder is visible to executives and administrators in an analytics dashboard, and the emails themselves are available for review within a webpage. This allows companies to keep track of phishing targets in a streamlined manner.
The technologies involved in the Phish Phinder back end include Azure SQL, Python Flask API and Azure Web Services. The front end incorporates an Angular framework for the webpages and CSS, HTML and JavaScript for the Outlook add-in.
Bosch: Classifying Target Vehicles for Adaptive Cruise Control
Bosch is a global engineering and technology company with products sold in 150 countries worldwide. Founded in Germany in 1886, Bosch is the world’s leading supplier of automotive components.
Bosch’s adaptive cruise control is an advanced driver assistance system that allows a vehicle to automatically change its speed based on traffic conditions. Using software that processes radar data and video footage from the vehicle, the behavior of surrounding vehicles is labeled.
For example, if the system determines that a car is cutting into the lane directly in front of the host vehicle, it will identify and label the new vehicle, and intelligently adjust its pace in real time.
Currently, Bosch employees determine the accuracy of the adaptive cruise control software by manually labeling video files and comparing them to the behavior of the vehicle. While necessary, this labeling process is costly and difficult because Bosch collects thousands of hours of video footage.
Classifying Target Vehicles for Adaptive Cruise Control is a tool that automates the label generation process. Using machine learning, video is analyzed to detect lane lines and surrounding vehicles. Then, a combination of statistical logic and machine learning labels the environment in a time-series fashion. Each label is assigned a confidence rating, allowing Bosch employees to easily identify and fix incorrect labels.
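One way to picture time-series labels with confidence ratings is sketched below: per-frame labels are merged into segments, each segment keeps the lowest confidence of its frames, and segments below a threshold are queued for human review. The tuple layout, the label names, the 0.7 threshold and the function names are assumptions for this example, not details taken from the tool.

```python
from itertools import groupby


def to_segments(frames):
    """Merge consecutive frames with the same label into segments.

    frames: list of (time_s, label, confidence) tuples sorted by time.
    """
    segments = []
    for label, run in groupby(frames, key=lambda frame: frame[1]):
        run = list(run)
        segments.append({
            "label": label,
            "start": run[0][0],
            "end": run[-1][0],
            # A segment is only as trustworthy as its weakest frame.
            "confidence": min(frame[2] for frame in run),
        })
    return segments


def needs_review(segments, threshold=0.7):
    """Return the segments whose confidence falls below the threshold."""
    return [s for s in segments if s["confidence"] < threshold]


frames = [
    (0.0, "following", 0.95),
    (0.1, "following", 0.92),
    (0.2, "cut-in", 0.55),
    (0.3, "cut-in", 0.81),
    (0.4, "following", 0.90),
]
segments = to_segments(frames)
review = needs_review(segments)
print(review)
```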
This tool significantly reduces the time and effort required to manually label testing videos.
Our software is deployed to both Windows and Linux. The user interface is built with PyQt. YOLOv3 is used to recognize vehicles, and ERFNet is used to detect lane lines. A combination of machine learning and logic is used to compute the labels.
The Dow Chemical Company: Manufacturing Avatar Plant Twin (MAPT)
Headquartered in Midland, Michigan, Dow is a global leader in specialty chemicals, advanced materials, and plastics. Dow provides a world-class portfolio of advanced, sustainable, and leading-edge products.
Working with chemical products requires extreme precision to ensure the safety of all involved. This calls for precise equipment location and tracking records. Currently, Dow’s technical experts manually complete these monotonous, non-uniform reports. With plants in 160 countries, it is increasingly difficult to coordinate this information.
Our Manufacturing Avatar Plant Twin (MAPT) system provides Dow’s experts with the simple and precise tools needed to report accurate equipment locations and build a centralized database with up-to-date information.
Our system streamlines the sensor assignment process for different pieces of equipment at Dow plants. Using our web application, a user analyzes assets such as pumps, compressors and furnaces, then reports the locations of sensors attached to these pieces of equipment.
Once the user is finished reporting sensor locations, the information is propagated to the database, where it is compared with other reports assigned to the same asset. Discrepancies and errors are flagged in the background.
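The comparison of reports for the same asset can be sketched as follows: reports are grouped by asset and sensor, and any sensor with more than one distinct reported location is flagged. The record fields, the identifiers and the function name `find_discrepancies` are hypothetical; the actual MAPT schema and comparison rules are not described in the source.

```python
from collections import defaultdict


def find_discrepancies(reports):
    """Return {(asset, sensor): [locations]} where reports disagree."""
    seen = defaultdict(set)
    for report in reports:
        seen[(report["asset"], report["sensor"])].add(report["location"])
    return {
        key: sorted(locations)
        for key, locations in seen.items()
        if len(locations) > 1
    }


# Made-up reports from three users for the same pump.
reports = [
    {"asset": "pump-7", "sensor": "temp-1", "location": "inlet"},
    {"asset": "pump-7", "sensor": "temp-1", "location": "outlet"},
    {"asset": "pump-7", "sensor": "vib-2", "location": "housing"},
    {"asset": "pump-7", "sensor": "vib-2", "location": "housing"},
]
discrepancies = find_discrepancies(reports)
print(discrepancies)
```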
To aid in the reporting process, machine learning is used to suggest potential layouts to the user for new assets, based on trends in previously submitted data.
Our web application is built using the Microsoft Azure Cloud Computing Platform. The user interface runs on CSS, HTML, and JavaScript. All the records are stored in an SQL database that is managed and implemented with C#. The Manufacturing Avatar Plant Twin supports desktop and mobile browsers.
Evolutio: ERP Air Force: Conservation Threat Detection
Evolutio is a group of technology professionals convinced that business problems have significantly simpler solutions than the market is led to believe. These solutions span across the globe, including the non-profit Elephants, Rhinos, and People (ERP), a group founded to preserve and protect Southern Africa’s wild elephants and rhinos.
As part of their initiative to preserve and protect wildlife, ERP uses drones, or unmanned aerial vehicles (UAVs), to monitor elephants at the Rietvlei Reserve in South Africa.
Wildlife is threatened every day not only by poachers, but also by the destruction of food sources, the disruption of habitat by tourists and natural threats such as floods, wildfires, and drought. In a 400,000-acre park, it is impossible to detect and monitor threats without an automated system.
Our Conservation Threat Detection system serves two primary functions: auto-identify threats in drone footage and inform rangers of these identified threats in real time.
ERP pilots fly drones equipped with cameras throughout the reserve and our system automatically detects any threats, including cars, humans, fires, and floods, from the camera feed in real time.
If a threat is detected, nearby rangers are informed of the threat and its location through a graphical user interface (shown on the right), together with silent notifications conveyed through vibration motors mounted in our custom-designed ranger vest.
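Deciding which rangers count as "nearby" could be done with a great-circle distance check, as in the sketch below. The haversine formula is standard, but the 5 km radius, the ranger names, the coordinates and the function names are assumptions for illustration; the source does not say how the system selects rangers.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def rangers_to_notify(threat, rangers, radius_km=5.0):
    """Return names of rangers within radius_km of the threat, nearest first."""
    distances = [
        (haversine_km(threat[0], threat[1], lat, lon), name)
        for name, (lat, lon) in rangers.items()
    ]
    return [name for dist, name in sorted(distances) if dist <= radius_km]


# Made-up coordinates for illustration only.
threat = (-25.89, 28.29)
rangers = {"A": (-25.88, 28.29), "B": (-26.00, 28.29)}
notified = rangers_to_notify(threat, rangers)
print(notified)
```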
Our system allows ERP to monitor large areas of land in real time without the need for ERP personnel to manually analyze hundreds of hours of drone video footage. This allows ERP to more quickly respond to imminent threats.
Our threat detection is done using neural networks built with TensorFlow. All components of the system communicate over Ethernet, and the main system runs on a Jetson Nano.
Ford Motor Company: Ford Augmented Reality Owner’s Manual
Ford Motor Company is a multinational automotive manufacturer headquartered in Dearborn, Michigan. It employs 199,000 people and produced a total of 5.9 million vehicles in the last recorded year. Ford designs and manufactures a full line of cars, trucks, SUVs and electric vehicles under both the Ford and Lincoln brands.
Every Ford vehicle comes with a printed owner’s manual containing more than 300 pages of basic information pertaining to the operation and maintenance of the vehicle. This manual is cumbersome, difficult to navigate, and has not evolved with the technology inside the vehicles.
Our Augmented Reality Owner’s Manual application provides an intuitive and accessible digital version of the owner’s manual with augmented reality (AR) capabilities.
The interior of the vehicle is displayed from the driver’s perspective using the phone’s camera. From this screen, interactive digital content is overlaid using AR, enabling users to quickly access resources.
When a user clicks on an interactive component of the augmented reality display, a list of relevant content is displayed. This content includes a digital version of the corresponding owner’s manual section, tutorial videos and answers to frequently asked questions. Alternatively, the same content is accessible through the search bar from the app’s homepage.
Authorized Ford employees create, edit and delete vehicles and manage any associated content through the web application. This content is accessed via the iOS app for the respective vehicle.
Our iOS application leverages Swift and ARKit to provide an AR experience. The web application is built using the ReactJS framework. The web application and iOS application are linked through an API and database hosted by Amazon Web Services.
General Motors: Open Source Intel
General Motors (GM) is a multinational automotive manufacturer headquartered in Detroit, Michigan. GM is ranked #13 on the Fortune 500 for total revenue and is the largest auto manufacturer headquartered in the United States.
Maintaining strong information security is a priority for GM to protect sensitive information that could compromise asset security and communication privacy. Publicly visible credentials grant unauthorized parties the opportunity to infiltrate GM assets and view private communication networks.
Our Open Source Intel system automates the discovery of security threats by collecting and analyzing information from various public code repositories on the internet such as GitHub, GitLab and Bitbucket.
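As a simplified picture of what analyzing repository content for leaked credentials can involve, the sketch below scans text line by line with two regular expressions. The pattern names, the `scan_text` function and the sample text are assumptions for this example, and pattern matching alone is far cruder than the pipeline described in this section; the `AKIA` prefix followed by 16 characters is the commonly documented shape of an AWS access key ID.

```python
import re

# Illustrative patterns only; a real scanner would need many more.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "assigned_secret": re.compile(
        r"\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name) for every suspicious line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


# Obviously fake values for demonstration.
sample = (
    'user = "alice"\n'
    'api_key = "abcd1234efgh5678"\n'
    'aws = "AKIAABCDEFGHIJKLMNOP"\n'
)
hits = scan_text(sample)
print(hits)
```

Each hit would still need to be scored, since patterns like these produce false positives on test fixtures and placeholder values.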
Leaked confidential intellectual property (IP), such as GM usernames, API keys and code snippets, is displayed on a user-friendly web application. When a threat is discovered, relevant information about the IP leak is displayed so that GM employees can quickly act to mitigate the threat.
A machine learning service gives each discovered leak a confidence score. If a threat is assigned a high enough confidence score, employees are notified via text message and/or email.
Open Source Intel automates the currently manual process of discovering the warning signs of a leak and drastically increases employee effectiveness by letting them focus on threat mitigation instead of threat discovery.
The Python data collection pipeline is orchestrated using Celery, pipeline data is stored temporarily in Redis, and code is processed using PyDriller. A trained scikit-learn machine learning model quantifies each hit discovered by the pipeline. Open Source Intel stores data in a PostgreSQL database. This database then feeds the Python Django web application for display.