tab
tableqa
pypi i tableqa
tab

tableqa

AI Tool for querying natural language on tabular data.

by Abhijith Neil Abraham

0.0.10 (see all)License:GNU GPL v2
pypi i tableqa
Readme

tableQA

Tool for querying natural language on tabular data like csvs,excel sheet,etc.

Build Status Open In Colab

Features

  • Supports detection from multiple csvs
  • Supports FuzzyString implementation. i.e, incomplete csv values in query can be automatically detected and filled in the query.
  • Open-Domain, No training required.
  • Add manual schema for customized experience
  • Auto-generate schemas in case schema not provided
  • Data visualisations

Configuration:

install via pip:

pip install tableqa

installing from source:

git clone https://github.com/abhijithneilabraham/tableQA

cd tableqa

python setup.py install

Quickstart

Do sample query

from tableqa.agent import Agent
agent=Agent(df) #input your dataframe
response=agent.query_db("Your question here")
print(response)

Get an SQL query from the question

sql=agent.get_query("Your question here")  
print(sql) #returns an sql query

Adding Manual schema

Schema Format:
{
    "name": DATABASE NAME,
    "keywords":[DATABASE KEYWORDS],
    "columns":
    [
        {
        "name": COLUMN 1 NAME,
        "mapping":{
            CATEGORY 1: [CATEGORY 1 KEYWORDS],
            CATEGORY 2: [CATEGORY 2 KEYWORDS]
        }

        },
        {
        "name": COLUMN 2 NAME,
        "keywords": [COLUMN 2 KEYWORDS]
        },
        {
        "name": "COLUMN 3 NAME",
        "keywords": [COLUMN 3 KEYWORDS],
        "summable":"True"
        }
    ]
}

  • Mappings are for those columns whose values have only few distinct classes.
  • Include only the column names which need to have manual keywords or mappings.Rest will will be autogenerated.
  • summable is included for Numeric Type columns whose values are already count representations. Eg. Death Count,Cases etc. consists values which already represent a count.

Example (with manual schema):

Database query
from tableqa.agent import Agent
agent=Agent(df,schema) #pass the dataframe and schema objects
response=agent.query_db("how many people died of stomach cancer in 2011")
print(response)
#Response =[(22,)]
SQL query
sql=agent.get_query("How many people died of stomach cancer in 2011")
print(sql)
#sql query: SELECT SUM(Death_Count) FROM cancer_death WHERE Cancer_site = "Stomach" AND Year = "2011"

Multiple CSVs

Pass the absolute path of the directories containing the csvs and schemas respectively. Refer cleaned_data and schema for examples.

Example
csv_path="/content/tableQA/tableqa/cleaned_data"
schema_path="/content/tableQA/tableqa/schema"
agent=Agent(csv_path,schema_path)

Join us

Join our slack workspace:Slack

GitHub Stars

131

LAST COMMIT

1yr ago

MAINTAINERS

1

CONTRIBUTORS

8

OPEN ISSUES

9

OPEN PRs

4
VersionTagPublished
0.0.10
2yrs ago
0.0.9
2yrs ago
0.0.8
2yrs ago
0.0.7
2yrs ago
No alternatives found
No tutorials found
Add a tutorial