Project Requirements
Executive Summary
PyMigBench is a benchmark of Python library migration data that was developed in the following paper. PyMigBenchWeb will be a web portal that makes the benchmark available online and provides general information about the project. Researchers and software engineering practitioners will use the website to browse the PyMigBench benchmark and to learn more about the project.
Project Glossary
- Benchmark - The collection of Python library migration data gathered in PyMigBench.
- Library Cluster - A group of related library pairs from PyMigBench, clustered together.
- Function/Library Map - A mapping, based on PyMigBench, from each function or library to the functions or libraries it was migrated to, highlighted using a Sankey diagram.
- Result - The data returned from a search query on PyMigBench experiments.
- Tableform - The table layout in which a result returned from a search query on PyMigBench is displayed.
User Stories
US 1.01 - Query template
As a User, I want to query the database using a query template, so that I do not have to enter the whole query from scratch (5)
Acceptance Tests
1. User can learn about query format and keywords by using the query builder to construct queries
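A minimal sketch of how a query template could turn the user's selections into a query. The table name `migration`, its columns, and the `build_query` helper are hypothetical and are not taken from PyMigBench; the sketch only illustrates the idea of filling a template instead of typing a query from scratch.

```python
# Hypothetical schema: the columns each dataset exposes (an assumption for this sketch).
ALLOWED_COLUMNS = {
    "migration": {"source_lib", "target_lib", "repo", "commit_sha"},
}


def build_query(table, columns, filters):
    """Assemble a parameterized SELECT from template choices.

    Returns (sql, params). Filter values are passed as parameters
    rather than being inlined into the SQL text.
    """
    allowed = ALLOWED_COLUMNS.get(table)
    if allowed is None:
        raise ValueError(f"unknown dataset: {table}")
    if not columns:
        raise ValueError("select at least one column")
    unknown = (set(columns) | set(filters)) - allowed
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")

    sql = f"SELECT {', '.join(columns)} FROM {table}"
    params = []
    if filters:
        clauses = []
        for column, value in filters.items():
            clauses.append(f"{column} = ?")
            params.append(value)
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = build_query(
    "migration", ["source_lib", "target_lib"], {"source_lib": "argparse"}
)
print(sql)     # SELECT source_lib, target_lib FROM migration WHERE source_lib = ?
print(params)  # ['argparse']
```

Checking names against an allow-list keeps the generated SQL limited to known identifiers, which also gives the builder a natural place to teach the user which keywords and columns exist.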
US 1.02 - SQL Compliant query in text box
As a User, I want to be able to write a SQL Compliant query in a text box, instead of using the template, so that it is easier for me to copy the query from another source or paste it somewhere else. (8)
Acceptance Tests
1. User can enter text into a textbox
2. User can submit a valid query and the query is carried out
US 1.04 - Visualize Library pair mapping
As a User, I want to visualize the mapping of PyMigBench benchmark library pairs, so that I can view my options at a glance. (5)
Acceptance Tests
1. User can open webpage and view mapping of different libraries
2. User can select the connections between libraries that can be migrated
3. User can see the library pairs and all the examples that they can migrate
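The library pair mapping could be fed to a Sankey chart as a list of node labels plus a list of weighted links, which is the shape most Sankey chart libraries accept. The sketch below assumes the pairs arrive as `(source, target, count)` tuples; the `sankey_data` helper and the sample counts are made up for illustration and are not PyMigBench data.

```python
def sankey_data(pairs):
    """Convert (source_lib, target_lib, count) tuples into Sankey nodes and links."""
    nodes = []   # (side, library name) in first-seen order
    index = {}   # (side, library name) -> position in nodes

    def node_id(key):
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    links = []
    for source, target, count in pairs:
        # Sources and targets get separate nodes, so a library that appears
        # on both sides does not create a cycle in the diagram.
        links.append({
            "source": node_id(("from", source)),
            "target": node_id(("to", target)),
            "value": count,
        })
    labels = [name for _, name in nodes]
    return labels, links


# Illustrative input only; the counts are invented.
labels, links = sankey_data([
    ("argparse", "click", 3),
    ("pycrypto", "pycryptodome", 2),
    ("argparse", "docopt", 1),
])
print(labels)  # ['argparse', 'click', 'pycrypto', 'pycryptodome', 'docopt']
print(links[2])  # {'source': 0, 'target': 4, 'value': 1}
```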
US 2.01 - Click the commit/file path in result
As a User, I want to be able to click on the commit or file path in the result, if they are returned, so that I can see the commit or file in question. (3)
Acceptance Tests
1. User can view all commits/file paths in the result, based on their query.
2. User can click on a commit/file path in the result, based on their query.
3. User cannot view any commit/file path in the result if there are no matching results for the query they entered.
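Making a commit or file path clickable comes down to turning the stored values into a URL. The sketch below assumes the repositories are hosted on GitHub and that each result row carries a repository in `owner/name` form, a commit hash, and a file path; those field shapes and both helper names are assumptions, not something the benchmark specifies.

```python
from urllib.parse import quote


def commit_url(repo, sha):
    """Link to a commit page, assuming a GitHub-hosted 'owner/name' repository."""
    return f"https://github.com/{repo}/commit/{sha}"


def file_url(repo, sha, path):
    """Link to a file as it existed at a given commit (GitHub 'blob' view)."""
    # quote() keeps '/' intact and escapes characters such as spaces.
    return f"https://github.com/{repo}/blob/{sha}/{quote(path)}"


print(commit_url("owner/project", "abc1234"))
# https://github.com/owner/project/commit/abc1234
print(file_url("owner/project", "abc1234", "src/my file.py"))
# https://github.com/owner/project/blob/abc1234/src/my%20file.py
```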
US 2.02 - SQL Compliant query highlight syntax
As a User, I want to have highlighting as I am writing the SQL Compliant query, so that it is easier for me to debug my query. (8)
Acceptance Tests
1. User can view the keywords highlighted in their query as they enter a SQL Compliant query.
2. User can view the keyword highlighting update when they modify their query.
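As a rough illustration of keyword highlighting, the sketch below wraps SQL keywords in an HTML `<span>` that a stylesheet could color. The keyword list, the `kw` class name, and the `highlight` helper are assumptions for this sketch. It is a plain regular-expression pass, so it would also highlight keywords that appear inside string literals; a real implementation would more likely rely on an editor component or a proper tokenizer.

```python
import html
import re

# A small, assumed keyword set; a real one would cover the supported dialect.
SQL_KEYWORDS = ("SELECT", "FROM", "WHERE", "AND", "OR", "ORDER", "BY", "LIMIT")
KEYWORD_RE = re.compile(r"\b(" + "|".join(SQL_KEYWORDS) + r")\b", re.IGNORECASE)


def highlight(query):
    """Return HTML with each SQL keyword wrapped in <span class="kw">."""
    parts = []
    last = 0
    for match in KEYWORD_RE.finditer(query):
        # Escape the text between keywords so user input cannot inject markup.
        parts.append(html.escape(query[last:match.start()]))
        parts.append(f'<span class="kw">{match.group(0)}</span>')
        last = match.end()
    parts.append(html.escape(query[last:]))
    return "".join(parts)


print(highlight("select repo from migration where stars > 5"))
```

Re-running `highlight` on every change to the text box is enough to keep the highlighting in step with the query for inputs of this size.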
US 2.03 - Dynamic results table
As a User, I want to see a table of results including different data types and properties from the query result, so that I can compare the details of library/function pairs. (5)
Acceptance Tests
1. User can view their data in tabulated form.
2. User can view all the data types and properties in the table of results for the entered query.
US 2.04 - Syntax checker for SQL Compliant query
As a User, I want to have syntax checking as I am writing the SQL Compliant query, so that it is easier for me to debug my query. (8)
Acceptance Tests
1. User can view any syntax errors in their query while they are writing a SQL Compliant query.
2. User can make modifications to the query based on the errors reported for it.
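One possible way to check syntax on the backend is to ask a database engine to compile the query without running it. The sketch below uses Python's built-in `sqlite3` module with an in-memory database and `EXPLAIN`; choosing SQLite as the dialect, the one-table `SCHEMA`, and the `check_syntax` helper are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema, so that table and column names can be resolved.
SCHEMA = "CREATE TABLE migration (source_lib TEXT, target_lib TEXT, repo TEXT)"


def check_syntax(query):
    """Return None if SQLite can compile the query, else the error message."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        # EXPLAIN compiles the statement but does not execute it.
        conn.execute("EXPLAIN " + query)
        return None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return str(exc)
    finally:
        conn.close()


print(check_syntax("SELECT repo FROM migration"))  # None
print(check_syntax("SELEC repo FROM migration"))   # an error message from SQLite
```

The returned message can be shown next to the text box so the user can correct the query before submitting it.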
US 2.05 - View code changes
As a User, I want to see examples of code changes that have taken place from a library migration, displayed the way I would view diff files in VS Code. (8)
Acceptance Tests
1. User can view examples of the code changes that have taken place for a library migration pair.
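A diff view needs the changed lines between the file before and after the migration. As a sketch, Python's standard `difflib` module can produce a unified diff that a frontend could then render with red and green lines. The two code snippets below are invented for illustration and are not taken from the benchmark.

```python
import difflib

# Invented before/after snippets standing in for a migrated file.
before = [
    "import simplejson as json",
    "data = json.loads(text)",
]
after = [
    "import json",
    "data = json.loads(text)",
]

diff = list(
    difflib.unified_diff(
        before, after, fromfile="before.py", tofile="after.py", lineterm=""
    )
)
print("\n".join(diff))
```

Lines starting with `-` were removed and lines starting with `+` were added, which maps directly onto the side-by-side or inline diff styling users know from their editor.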
US 3.01 - Auto-complete option for SQL Compliant query
As a User, I want to have options to auto-complete as I am writing the SQL Compliant query, so that it is easier for me to write the query. (8)
Acceptance Tests
1. User can view suggestions for their query as they are writing a SQL Compliant query.
2. User can select the suggested query.
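Auto-complete can start as a simple prefix match of the word being typed against the known keywords and schema names. The vocabulary and the `suggest` helper below are assumptions for this sketch; a fuller version would take the cursor position and query context into account.

```python
# Assumed vocabulary: a few SQL keywords plus hypothetical table/column names.
VOCABULARY = [
    "SELECT", "FROM", "WHERE", "ORDER BY", "LIMIT",
    "migration", "source_lib", "target_lib",
]


def suggest(prefix, vocabulary=VOCABULARY, limit=5):
    """Return up to `limit` vocabulary entries that start with `prefix`."""
    needle = prefix.casefold()
    if not needle:
        return []
    matches = [word for word in vocabulary if word.casefold().startswith(needle)]
    return sorted(matches)[:limit]


print(suggest("s"))    # ['SELECT', 'source_lib']
print(suggest("lim"))  # ['LIMIT']
```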
US 3.02 - View PyMigBench documentation
As a User, I want to view the PyMigBench documentation and research work, so that I can know more about the project. (2)
Acceptance Tests
1. User can find and view the documentation and research work for PyMigBench.
US 4.01 - Import YAML files to the database
As a User, I want to be able to import existing YAML files to the database using ssh, so that I can keep the database up to date. (5)
Acceptance Tests
1. User can ssh into the database
2. User can import new YAML files, updating the database
US 4.04 - Normalized data
As a User, I want the data to be normalized, so that I do not see duplicate data when using the system. (5)
Acceptance Tests
1. User can only see one instance of data
2. User won't see duplicate data
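A sketch of what normalization could look like in a relational schema: each library is stored once, and library pairs refer to libraries by id, with uniqueness constraints rejecting duplicates. The table names, columns, and helper functions are hypothetical, and SQLite is used here only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE library (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE library_pair (
    source_id INTEGER NOT NULL REFERENCES library(id),
    target_id INTEGER NOT NULL REFERENCES library(id),
    UNIQUE (source_id, target_id)
);
""")


def library_id(name):
    """Insert the library if it is new and return its id."""
    conn.execute("INSERT OR IGNORE INTO library (name) VALUES (?)", (name,))
    row = conn.execute("SELECT id FROM library WHERE name = ?", (name,)).fetchone()
    return row[0]


def add_pair(source, target):
    """Record a library pair; repeated inserts of the same pair are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO library_pair (source_id, target_id) VALUES (?, ?)",
        (library_id(source), library_id(target)),
    )


add_pair("argparse", "click")
add_pair("argparse", "click")   # duplicate, ignored
add_pair("argparse", "docopt")

library_count = conn.execute("SELECT COUNT(*) FROM library").fetchone()[0]
pair_count = conn.execute("SELECT COUNT(*) FROM library_pair").fetchone()[0]
print(library_count, pair_count)  # 3 2
```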
US 4.03 - Export dataset as YAML files
As a User, I want to be able to export the dataset from the database as a YAML file. (8)
Acceptance Tests
1. User can ssh into the database
2. User can access the database, exporting YAML format files out of the database
US 5.01 - View general and contact information
As a User, I want to know the general and contact information about PyMigBench, so that I can contact the main person for the project. (1)
Acceptance Tests
1. User can navigate the webpage to find the contact info for PyMigBench
2. User can then reach out to the creator of the project
US 5.02 - Dynamic Query Builder
As a User, I want to be able to view the properties of only the selected dataset, so that I am not overwhelmed with a huge list of properties. (1)
Acceptance Tests
1. User sees a list of properties based on the dataset that they have selected in the query builder
MoSCoW
Must Have
- US 1.01 - Query template
- US 1.02 - SQL Compliant query in text box
- US 1.04 - Visualize Library pair mapping
- US 2.01 - Click the commit/file path in result
- US 2.03 - Dynamic results table
- US 2.05 - View code changes
- US 4.04 - Normalized data
- US 4.03 - Export dataset as YAML files
- US 5.02 - Dynamic Query Builder
Should Have
- US 2.04 - Syntax checker for SQL Compliant query
- US 3.02 - View PyMigBench documentation
Could Have
- US 2.02 - SQL Compliant query highlight syntax
- US 5.01 - View general and contact information
Would like but Won't get
- US 4.01 - Import YAML files to the database
- US 3.01 - Auto-complete option for SQL Compliant query
Similar Products
-
Contains library and function migration benchmarks for Java.
Open-source Projects
- Superset
- is a data visualization and data exploration platform. It has a dashboard with many graph templates for the user to choose from, including Sankey and force-directed diagrams. It also provides a database connection feature. The user can use SQL queries to choose the data to visualize, which is very similar to our project. From this project, we can get inspiration for database connection, SQL query parsing, graph implementation and UI design.
- D3.js
- is a JavaScript library for manipulating graphs based on data. It can help bring data to life using HTML, SVG, and CSS. Drawing graphs with it takes considerable effort, but it can create complex graphs with high flexibility.
- plot.js
- is an open-source JavaScript charting library which packages many pre-defined functionalities using D3.js. It reduces the effort of implementing different graphs, but it may not support complex graphs or offer the same flexibility to customize them. Since it provides a very simple Sankey template, we can use it directly if that meets the requirement.
- sql-lint
- is a tool for checking SQL sanity in a JavaScript environment. Since we may need syntax checking and highlighting in the frontend, this open-source project can reduce the difficulty of checking SQL statements for errors.
- python-sqlparse
- is a non-validating SQL parser for Python. It provides support for parsing, splitting and formatting SQL statements. With this library, we can easily parse the user's SQL statements and run queries in the backend.