Measuring documentation completeness in multiple languages
US11620127B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | May 11, 2021 |
| Grant date | Apr 4, 2023 |
| Priority date | — |
| Expiry date | Sep 4, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V10/454
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Source code is analyzed to identify components. The components are each assigned a complexity score. Documentation for the source code is identified, related to the components, and given a score based on the quantity of the documentation for the component and the complexity score for the component. To determine semantic meaning of the documentation, vector embeddings for the documentation languages may be generated and aligned. Alignment causes the different machine learning models to generate similar vectors for semantically similar words in the different languages. Since the vectors of the words of the other languages are similar to the vectors of the words in a primary language with similar meanings, the vector representation of the documentation in the other languages will match the vector representation of the source code when the documentation is substantially on the same topic.
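The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the patented method. Every name here (`complexity_score`, `documentation_score`, `topic_similarity`, the `words_per_point` parameter) is hypothetical, the branch-counting complexity proxy and the word-count quantity measure are assumptions, and the two tiny embedding tables are hand-made toy data that stand in for already-aligned per-language embeddings.

```python
import ast
import math

def complexity_score(source):
    """Assumed complexity proxy: 1 plus the number of branching nodes."""
    branches = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)
    return 1 + sum(isinstance(n, branches) for n in ast.walk(ast.parse(source)))

def documentation_score(doc, complexity, words_per_point=5):
    """Documentation quantity relative to complexity, capped at 1.0.

    words_per_point is a hypothetical tuning knob: how many words of
    documentation are expected per unit of complexity.
    """
    expected = complexity * words_per_point
    return min(1.0, len(doc.split()) / expected)

def mean_vector(words, embeddings):
    """Average the vectors of the known words; None if no word is known."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def topic_similarity(code_words, code_emb, doc, doc_emb):
    """Compare the code's vector with the documentation's vector."""
    code_vec = mean_vector(code_words, code_emb)
    doc_vec = mean_vector(doc.lower().split(), doc_emb)
    if code_vec is None or doc_vec is None:
        return 0.0
    return cosine(code_vec, doc_vec)

# Toy data (assumption): embeddings already aligned so that words with
# similar meanings in English and German have similar vectors.
EN = {"sort": [1.0, 0.0], "list": [0.8, 0.2], "network": [0.0, 1.0]}
DE = {"sortiert": [0.9, 0.1], "liste": [0.8, 0.2], "netzwerk": [0.1, 0.9]}

SOURCE = (
    "def sort_list(items):\n"
    "    if not items:\n"
    "        return []\n"
    "    return sorted(items)\n"
)

if __name__ == "__main__":
    c = complexity_score(SOURCE)
    print("complexity:", c)
    print("doc score:", documentation_score("Sorts the list and returns it", c))
    print("on topic:", topic_similarity(["sort", "list"], EN, "sortiert liste", DE))
    print("off topic:", topic_similarity(["sort", "list"], EN, "netzwerk", DE))
```

In this toy setup the German documentation about sorting a list lands close to the vector built from the code's English identifiers, while documentation about an unrelated topic does not. A real system would replace the hand-made tables with learned embeddings and an actual alignment step.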
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.