There is an API for everything – including IRD

What is an API?

An API, or Application Programming Interface, is a set of rules and protocols that allows different software applications to communicate with each other. It defines the methods and data formats that applications can use to request and exchange information or services.

Although the IRD – the Inland Revenue Department of Nepal – doesn’t publicly provide an API, there is a way to discover its hidden APIs: use a network or HTTP logger to observe what data is exchanged over the network with the IRD’s services.
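As a starting point, Python itself can act as a crude HTTP logger. Turning on wire-level debugging makes every request and response line sent by libraries like requests appear on stdout, which helps confirm which hidden endpoints a page is calling. This is a generic sketch, not anything IRD-specific:

```python
import logging
import http.client

# Wire-level debugging: http.client echoes each request/response line,
# and urllib3 (used internally by requests) logs connection activity.
http.client.HTTPConnection.debuglevel = 1
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("urllib3").setLevel(logging.DEBUG)

# Any requests.get()/requests.post() call made after this point will
# now print its headers, revealing the underlying API traffic.
```

Browser developer tools (the Network tab) or a dedicated proxy give a richer view, but this is often enough to spot the handler URLs a page talks to.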

Think of an API as a bridge between two software systems, enabling them to work together. In the scripts below, the API enables communication between the scripts and the IRD’s services. APIs are commonly used in web development to connect web applications with external services or databases. They provide a standardized way for developers to access and manipulate data, which simplifies building complex software systems and promotes interoperability between applications. 
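Concretely, most web APIs exchange JSON text that the client parses into a data structure. A minimal illustration with a made-up response (not real IRD data):

```python
import json

# A fabricated sample of what a JSON API response might look like
sample_response = '{"SubmissionNo": "12345", "Status": "Filed"}'

# The client parses the text into a structure it can work with
data = json.loads(sample_response)
print(data["Status"])  # Filed
```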

What do these scripts do?

This Python code automates the process of logging into the IRD website, retrieving specific data, and saving it to an Excel file. It starts with setting up login credentials (PAN and password), then uses a Python library to interact with the website. It logs in, fetches data, and organizes it into an Excel file.

The most significant benefit of this code is that it eliminates the need for manual work. Without it, you would have to manually log into the IRD website, navigate through various pages to find the data you need, and then copy it. Additionally, the IRD website doesn’t allow direct data export to Excel, making manual data entry from PDF reports necessary. This code automates the entire process, saving time and eliminating potential font rendering issues that might arise from copying data from PDFs. 

Extracting all details of all VAT Returns you ever filed

'''
'##::::'##::::'###::::'########:
 ##:::: ##:::'## ##:::... ##..::
 ##:::: ##::'##:. ##::::: ##::::
 ##:::: ##:'##:::. ##:::: ##::::
. ##:: ##:: #########:::: ##::::
:. ## ##::: ##.... ##:::: ##::::
::. ###:::: ##:::: ##:::: ##::::
:::...:::::..:::::..:::::..:::::
'''

# Insert your PAN and Password Here
pan = "pangoeshere"              # your PAN, as a string
TPPassword = "passwordgoeshere"  # your password, as a string (keep the quotes)

import requests
import json
import pandas as pd
# Define the login URL and credentials
login_url = " " # Type in the URL for the TaxPayerValidLoginHandler ASHX webpage here

TPName = pan
login_payload = {
    "pan": pan,
    "TPName": TPName,
    "TPPassword": TPPassword,
    "formToken": "a",
    "pIP": "27.34.68.199",
    "LoginType": "NOR"
}

# Create a session to persist cookies across requests
session = requests.Session()

# Step 1: Send a POST request to the login page
login_response = session.post(login_url, data=login_payload)

# Check if the login was successful based on the response content
# (the misspelled "Succcessful" matches the text the portal returns)
if "User Login Succcessful" in login_response.text:
    print("Login Successful")

    # Define the resource URL you want to access after login
    resource_url = " " # Type in the URL for the VatReturnsHandler ASHX webpage here
    
    # Step 2: Make a GET request to the desired resource
    resource_response = session.get(resource_url)

    if resource_response.status_code == 200:
        # Remove everything before the first "[" and after the last "]"
        json_start = resource_response.text.find("[")
        json_end = resource_response.text.rfind("]")
        trimmed_json = resource_response.text[json_start:json_end + 1]

        # Parse the trimmed JSON
        original_data = json.loads(trimmed_json)

        # Build a dictionary keyed by SubmissionNo so the per-return
        # details can be merged in as they arrive
        merged_data = {entry["SubmissionNo"]: entry for entry in original_data}

        for item in original_data:
            # Get the SubmissionNo from the original JSON response
            submission_no = item.get("SubmissionNo")
            if submission_no:
                # Construct the URL for the additional JSON response
                additional_url = " " # Type in the URL for the VatReturnsHandler ASHX webpage here, encoded with the submission_no you are fetching data for
                print(additional_url)

                # Make a GET request to the additional URL
                additional_response = session.get(additional_url)
                if additional_response.status_code == 200:
                    # Trim the wrapper so only the JSON object remains,
                    # then re-wrap it as a one-element array
                    json_start = additional_response.text.find(":{")
                    json_end = additional_response.text.rfind("},")
                    trimmed_json = additional_response.text[json_start:json_end + 1][1:]
                    trimmed_json = "[" + trimmed_json + "]"

                    # Parse the trimmed JSON
                    additional_data = json.loads(trimmed_json)

                    # Merge the detail fields into the matching return
                    for entry in additional_data:
                        submission_number = entry["SubmissionNumber"]
                        if submission_number in merged_data:
                            merged_data[submission_number].update(entry)


    # Convert the merged_data dictionary to a list of merged entries
    merged_data_list = list(merged_data.values())

    # Create a pandas DataFrame from the merged data
    df = pd.DataFrame(merged_data_list)

    # Define the name of the Excel file
    excel_filename = "VatReturnDetails.xlsx"

    # Write the data to an Excel file
    df.to_excel(excel_filename, index=False)

else:
    print("Login Failed")
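The bracket-trimming step above (find the first "[", then the last "]") recurs in every script, so it could be factored into a small helper. A sketch – `extract_json_array` is my own name, not part of the original scripts:

```python
import json

def extract_json_array(text):
    """Return the substring from the first '[' through the last ']'.

    The IRD handlers wrap their JSON arrays in extra text, so the
    payload has to be trimmed before json.loads() will accept it.
    """
    start = text.find("[")
    end = text.rfind("]")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no JSON array found in response")
    return text[start:end + 1]

# Example with a wrapped payload
wrapped = 'someCallback([{"SubmissionNo": "101"}]);'
print(json.loads(extract_json_array(wrapped)))  # [{'SubmissionNo': '101'}]
```

Raising an error when no array is found also gives a clearer failure than the silent empty string the inline version would produce.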


        

Extracting all details of all ETDS Returns you ever filed

'''
'########:'########:::'######::
... ##..:: ##.... ##:'##... ##:
::: ##:::: ##:::: ##: ##:::..::
::: ##:::: ##:::: ##:. ######::
::: ##:::: ##:::: ##::..... ##:
::: ##:::: ##:::: ##:'##::: ##:
::: ##:::: ########::. ######::
:::..:::::........::::......:::
'''
# Insert your PAN and Password Here
pan = "pangoeshere"              # your PAN, as a string
TPPassword = "passwordgoeshere"  # your password, as a string (keep the quotes)


import requests
import json
import pandas as pd
import re
import urllib.parse

# Define the login URL and credentials
login_url = " " # Type in the URL for TaxPayerValidLoginHandler ASHX webpage here

TPName = pan
login_payload = {
    "pan": pan,
    "TPName": pan,
    "TPPassword": TPPassword,
    "formToken": "a",
    "pIP": "27.34.68.199",
    "LoginType": "NOR"
}

# Create a session to persist cookies across requests
session = requests.Session()

# Step 1: Send a POST request to the login page
login_response = session.post(login_url, data=login_payload)

#get today's date
dateresponse = requests.get("https://taxpayerportal.ird.gov.np/Handlers/Common/DateHandler.ashx?method=GetCurrentDate")
match = re.search(r'"NepaliDate":"(\d{4}\.\d{2}\.\d{2})"', dateresponse.text)
nepali_date = match.group(1)
nepali_date = nepali_date[:10]
print(nepali_date)

# Check if the login was successful based on the response content
if "User Login Succcessful" in login_response.text:
    print("Login Successful")
    
    base_url = " " # Type in the URL for GetTransactionHandler ASHX webpage here
    # Define the payload data
    payload = {
        "method": "GetWithholderRecs",
        "_dc": "",
        "objWith": '{"WhPan":"304460847","FromDate":"2060.01.01","ToDate":"2080.07.04"}',
        "page": 1,
        "start": 0,
        "limit": 25
    }
    # Assign the value of "nepali_date" to the "ToDate" key in the payload
    payload["objWith"] = payload["objWith"].replace('"ToDate":"2080.07.04"', f'"ToDate":"{nepali_date}"')
    payload["objWith"] = payload["objWith"].replace('"WhPan":"304460847"', f'"WhPan":"{pan}"')


    # Encode the payload as a query string
    encoded_payload = urllib.parse.urlencode(payload)
    resource_url = f"{base_url}?{encoded_payload}"

    # Define the resource URL you want to access after login
    print(resource_url)
    # Step 2: Make a GET request to the desired resource
    resource_response = session.get(resource_url)

    if resource_response.status_code == 200:
        # Remove everything before the first "[" and after the last "]"
        json_start = resource_response.text.find("[")
        json_end = resource_response.text.rfind("]")
        trimmed_json = resource_response.text[json_start:json_end + 1]

        # Parse the trimmed JSON
        original_data = json.loads(trimmed_json)

    else:
        print("List of Submission Number Not Obtained")
else:
    print("Login Failed")


### getting the TRANSACTION DETAILS
# Initialize an empty list to store the transactionInfo
transactionInfo = []

# Loop through each item in the original_data
for item in original_data:
    TranNo = item["TranNo"]
    
    # Define the payload (objIns is serialized to JSON so it matches the
    # objWith pattern used in the login step above)
    payload = {
        "method": "GetTrans",
        "objIns": json.dumps({
            "TransNo": TranNo,
            "RecStatus": "V",
            "FromDate": "1",
            "ToDate": "500"
        }),
        "formToken": "a"
    }
    
    # Define the URL
    base_url = " " # Type in the URL for InsertTransactionHandler ASHX webpage here

    # Encode the payload as a query string
    encoded_payload = urllib.parse.urlencode(payload)
    resource_url = f"{base_url}?{encoded_payload}"

    # Define the resource URL you want to access after login
    print(resource_url)

    try:
        # Send a POST request to the URL, reusing the logged-in session
        # so its cookies are sent along
        response = session.post(resource_url)
        print(response)
        
        # Check if the request was successful (status code 200)
        if response.status_code == 200:
        # Remove everything before the first "[" and after the last "]"
            json_start = response.text.find("[")
            json_end = response.text.rfind("]")
            trimmed_json = response.text[json_start:json_end + 1]

            # Parse the trimmed JSON
            each_response = json.loads(trimmed_json)
            #print(each_response)

            # Append the response to the transactionInfo list
            transactionInfo.append(each_response)
            
        else:
            print(f"Request for TranNo {TranNo} failed with status code {response.status_code}")
    except requests.exceptions.RequestException as e:
        print(f"Request for TranNo {TranNo} failed with an exception: {e}")

# At this point, transactionInfo contains the JSON responses for each TranNo
# You can access the data as needed

# Initialize an empty list to store the formatted data
formatted_data = []

# Extract column names dynamically from the JSON data
if transactionInfo:
    columns = set()
    for level1_item in transactionInfo:
        for level2_item in level1_item:
            columns.update(level2_item.keys())

    # Create a list of dictionaries with dynamic column names
    for level1_item in transactionInfo:
        for level2_item in level1_item:
            formatted_item = {"Level 1": level2_item.get("RowNumber")}
            for column in columns:
                formatted_item[column] = level2_item.get(column)
            formatted_data.append(formatted_item)

# Create a DataFrame from the formatted data
df = pd.DataFrame(formatted_data)

# Define the name of the Excel file you want to create
excel_file_name = "EtdsDetails.xlsx"

# Use pandas to save the DataFrame to an Excel file
df.to_excel(excel_file_name, index=False)

print(f"Data has been saved to {excel_file_name}")
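Rather than patching the hard-coded objWith template with str.replace, the same query parameter can be assembled directly with json.dumps. The PAN and date below are placeholders standing in for your own values:

```python
import json
import urllib.parse

pan = "123456789"           # placeholder PAN
nepali_date = "2080.07.04"  # placeholder date from the DateHandler

# Build objWith as a JSON string instead of editing a template string
obj_with = json.dumps({"WhPan": pan, "FromDate": "2060.01.01", "ToDate": nepali_date})

payload = {"method": "GetWithholderRecs", "objWith": obj_with,
           "page": 1, "start": 0, "limit": 25}
print(urllib.parse.urlencode(payload))
```

This avoids the fragile assumption that the template still contains the exact substrings being replaced.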

        

Extracting all details of all Vouchers linked to your ETDS Returns

'''
'##::::'##::'#######::'##::::'##:'##::::'##::'######::'########:'########::
 ##:::: ##:'##.... ##: ##:::: ##: ##:::: ##:'##... ##: ##.....:: ##.... ##:
 ##:::: ##: ##:::: ##: ##:::: ##: ##:::: ##: ##:::..:: ##::::::: ##:::: ##:
 ##:::: ##: ##:::: ##: ##:::: ##: #########: ##::::::: ######::: ########::
. ##:: ##:: ##:::: ##: ##:::: ##: ##.... ##: ##::::::: ##...:::: ##.. ##:::
:. ## ##::: ##:::: ##: ##:::: ##: ##:::: ##: ##::: ##: ##::::::: ##::. ##::
::. ###::::. #######::. #######:: ##:::: ##:. ######:: ########: ##:::. ##:
:::...::::::.......::::.......:::..:::::..:::......:::........::..:::::..::
'''

# Insert your PAN and Password Here
pan = "pangoeshere"              # your PAN, as a string
TPPassword = "passwordgoeshere"  # your password, as a string (keep the quotes)

import requests
import json
import pandas as pd
import re
import urllib.parse

# Define the login URL and credentials
login_url = " " # Type in the URL for TaxPayerValidLoginHandler ASHX webpage here

TPName = pan
login_payload = {
    "pan": pan,
    "TPName": pan,
    "TPPassword": TPPassword,
    "formToken": "a",
    "pIP": "27.34.68.199",
    "LoginType": "NOR"
}

# Create a session to persist cookies across requests
session = requests.Session()

# Step 1: Send a POST request to the login page
login_response = session.post(login_url, data=login_payload)

#get today's date
dateresponse = requests.get("https://taxpayerportal.ird.gov.np/Handlers/Common/DateHandler.ashx?method=GetCurrentDate")
match = re.search(r'"NepaliDate":"(\d{4}\.\d{2}\.\d{2})"', dateresponse.text)
nepali_date = match.group(1)
nepali_date = nepali_date[:10]
print(nepali_date)

# Check if the login was successful based on the response content
if "User Login Succcessful" in login_response.text:
    print("Login Successful")
    
    base_url = " " # Type in the URL for GetTransactionHandler ASHX webpage here
    # Define the payload data
    payload = {
        "method": "GetWithholderRecs",
        "_dc": "",
        "objWith": '{"WhPan":"304460847","FromDate":"2060.01.01","ToDate":"2080.07.04"}',
        "page": 1,
        "start": 0,
        "limit": 25
    }
    # Assign the value of "nepali_date" to the "ToDate" key in the payload
    payload["objWith"] = payload["objWith"].replace('"ToDate":"2080.07.04"', f'"ToDate":"{nepali_date}"')
    payload["objWith"] = payload["objWith"].replace('"WhPan":"304460847"', f'"WhPan":"{pan}"')


    # Encode the payload as a query string
    encoded_payload = urllib.parse.urlencode(payload)
    resource_url = f"{base_url}?{encoded_payload}"

    # Define the resource URL you want to access after login
    print(resource_url)
    # Step 2: Make a GET request to the desired resource
    resource_response = session.get(resource_url)

    if resource_response.status_code == 200:
        # Remove everything before the first "[" and after the last "]"
        json_start = resource_response.text.find("[")
        json_end = resource_response.text.rfind("]")
        trimmed_json = resource_response.text[json_start:json_end + 1]

        # Parse the trimmed JSON
        original_data = json.loads(trimmed_json)

    else:
        print("List of Submission Number Not Obtained")
else:
    print("Login Failed")


### getting the TRANSACTION DETAILS
# Initialize an empty list to store the transactionInfo
transactionInfo = []

# Loop through each item in the original_data
for item in original_data:
    TranNo = item["TranNo"]
    
    # Define the payload
    payload = {
        "method": "GetVouchInfo",
        "TranNo": TranNo, 
        "status": "V",
        "formToken": "a"
    }
    
    # Define the URL
    base_url = " " # Type in the URL for VoucherInformationHandler ASHX webpage here

    # Encode the payload as a query string
    encoded_payload = urllib.parse.urlencode(payload)
    resource_url = f"{base_url}?{encoded_payload}"

    # Define the resource URL you want to access after login
    print(resource_url)

    try:
        # Send a POST request to the URL, reusing the logged-in session
        # so its cookies are sent along
        response = session.post(resource_url)
        print(response)
        
        # Check if the request was successful (status code 200)
        if response.status_code == 200:
        # Remove everything before the first "[" and after the last "]"
            json_start = response.text.find("[")
            json_end = response.text.rfind("]")
            trimmed_json = response.text[json_start:json_end + 1]

            # Parse the trimmed JSON
            each_response = json.loads(trimmed_json)
            #print(each_response)

            # Append the response to the transactionInfo list
            transactionInfo.append(each_response)
            
        else:
            print(f"Request for TranNo {TranNo} failed with status code {response.status_code}")
    except requests.exceptions.RequestException as e:
        print(f"Request for TranNo {TranNo} failed with an exception: {e}")

# At this point, transactionInfo contains the JSON responses for each TranNo
# You can access the data as needed

# Initialize an empty list to store the formatted data
formatted_data = []

# Extract column names dynamically from the JSON data
if transactionInfo:
    columns = set()
    for level1_item in transactionInfo:
        for level2_item in level1_item:
            columns.update(level2_item.keys())

    # Create a list of dictionaries with dynamic column names
    for level1_item in transactionInfo:
        for level2_item in level1_item:
            formatted_item = {"Level 1": level2_item.get("RowNumber")}
            for column in columns:
                formatted_item[column] = level2_item.get(column)
            formatted_data.append(formatted_item)

# Create a DataFrame from the formatted data
df = pd.DataFrame(formatted_data)

# Define the name of the Excel file you want to create
excel_file_name = "VoucherDetails.xlsx"

# Use pandas to save the DataFrame to an Excel file
df.to_excel(excel_file_name, index=False)

print(f"Data has been saved to {excel_file_name}")
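The two nested loops that flatten transactionInfo into rows can also be expressed with itertools.chain, letting pandas infer the union of columns on its own. The sample data below is made up to mirror the shape of the real responses:

```python
import pandas as pd
from itertools import chain

# Fabricated responses shaped like transactionInfo: a list of JSON
# arrays, one per transaction, whose entries may have differing keys
transactionInfo = [
    [{"RowNumber": 1, "Amount": 100}],
    [{"RowNumber": 2, "Amount": 250, "Remarks": "late"}],
]

# Flatten one level; pandas fills any missing columns with NaN
df = pd.DataFrame(chain.from_iterable(transactionInfo))
print(sorted(df.columns))  # ['Amount', 'Remarks', 'RowNumber']
```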

        

Many more tools to come

More tools and applications are being developed, especially ones that use APIs to streamline interactions with the IRD (Inland Revenue Department) and similar government systems – so that we can fold our arms and watch the computers do the work.