Microsoft claims that a machine learning model it built for software developers can distinguish between security and non-security bugs 99% of the time.
By pairing the system with human security experts, Microsoft says it created an algorithm that not only identifies security bugs with close to 100% accuracy, but also correctly flags critical, high-priority bugs 97% of the time.
In the coming months, the company intends to open-source its methodology, including its processes and rules, on GitHub.
According to Microsoft, its workforce of roughly 47,000 developers generates about 30,000 bugs every month across its AzureDevOps and GitHub silos, creating major headaches for the security teams whose task it is to ensure that critical security weaknesses don't go unnoticed.
Tools that automatically flag and prioritize bugs do exist, but they sometimes produce false positives or classify severe vulnerabilities as low-impact problems.
To correct this, Microsoft set out to create a machine learning model capable of both classifying bugs as security or non-security issues and identifying critical and non-critical bugs, with a level of accuracy as close as possible to that of a security professional.
This initially involved feeding the model training data that security professionals had already approved, based on a statistical sampling of security and non-security bugs.
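A step like this could be sketched as follows. This is purely illustrative, not Microsoft's actual pipeline: the field names (`label`, `approved`) and the equal-per-class sampling strategy are assumptions.

```python
import random

def stratified_sample(bugs, per_class, seed=0):
    """Draw an equal number of expert-approved bugs from each class so the
    training set represents both security and non-security reports."""
    rng = random.Random(seed)
    sample = []
    for label in {b["label"] for b in bugs}:
        # Only bugs a security professional has already approved are eligible.
        pool = [b for b in bugs if b["label"] == label and b["approved"]]
        sample.extend(rng.sample(pool, min(per_class, len(pool))))
    return sample

# Hypothetical bug tracker export: far more non-security than security bugs.
bugs = (
    [{"label": "security", "approved": True}] * 30
    + [{"label": "non-security", "approved": True}] * 300
    + [{"label": "security", "approved": False}] * 5  # not yet expert-approved
)
train = stratified_sample(bugs, per_class=20)
print(len(train))  # 40: 20 security + 20 non-security, all expert-approved
```

Sampling an equal number per class is one simple way to keep a heavily imbalanced bug database from drowning out the rarer security class during training.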
Once the production model was ready, Microsoft set up a two-step learning process: the algorithm first learns to distinguish security bugs from non-security bugs, and then assigns labels to bugs indicating whether they are low-impact, important, or critical.
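A minimal sketch of that two-step triage might look like this. The keyword rules below are hypothetical stand-ins for the trained classifiers at each step; they only illustrate the control flow of classifying first, then assigning severity.

```python
# Step-1 and step-2 vocabularies: assumed for illustration, not Microsoft's.
SECURITY_TERMS = {"overflow", "injection", "xss", "csrf", "spoofing", "leak"}
CRITICAL_TERMS = {"remote", "unauthenticated", "privilege"}

def is_security_bug(title: str) -> bool:
    """Step 1: security vs. non-security (stand-in for a trained model)."""
    return bool(set(title.lower().split()) & SECURITY_TERMS)

def severity(title: str) -> str:
    """Step 2: label a security bug low-impact, important, or critical."""
    hits = len(set(title.lower().split()) & CRITICAL_TERMS)
    if hits >= 2:
        return "critical"
    return "important" if hits == 1 else "low-impact"

def triage(title: str) -> str:
    # Severity is only assigned to bugs the first step flagged as security.
    if not is_security_bug(title):
        return "non-security"
    return severity(title)

print(triage("remote unauthenticated buffer overflow in parser"))  # critical
print(triage("typo in settings dialog"))                           # non-security
```

Separating the two decisions mirrors the described design: the second model never sees non-security bugs, so it can specialize in ranking the severity of genuine security reports.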
Crucially, security professionals were involved with the production model at every stage of the journey: reviewing and approving data to confirm that the labels were accurate; selecting, training, and evaluating modeling techniques; and manually reviewing samples of bugs to assess the model's accuracy.
Scott Christiansen, Senior Security Program Manager at Microsoft, and Mayana Pereira, Data and Applied Scientist at Microsoft, explained that the model is automatically retrained with fresh data so that it keeps pace with Microsoft's internal production cycle.
They further explained that a security professional still approves the data before the model is retrained, and that the team continuously monitors the number of bugs generated in production.
"By applying machine learning to our data, we accurately classify which work items are security bugs 99 percent of the time," the pair said. "The model is also 97 percent accurate at labeling critical and non-critical security bugs. This level of accuracy gives us the confidence that we are catching more security vulnerabilities before they are exploited."