Python 查询 MSSQL 数据库表数量更新 Bar chart 英文版

python 查询 MSSQL 数据库表数量更新 Bar chart 英文版

Okay, “honeymoose”! Here’s a Python script that connects to an MSSQL database, queries the row counts for each user table, and then displays this information in a bar chart using matplotlib.

This English version will include:

  1. Prerequisites: What you need to install.
  2. Configuration: How to set up your database connection.
  3. The Python Script: Code to perform the query and generate the chart.
  4. Explanation: A breakdown of what the script does.

1. Prerequisites

You’ll need to install a few Python libraries:

  • pyodbc: For connecting to SQL Server.
  • matplotlib: For creating the bar chart.

You can install them using pip:

pip install pyodbc matplotlib

You also need an ODBC driver for SQL Server installed on your system. If you’re on Windows, it’s often pre-installed. If not, or if you’re on Linux/macOS, you might need to install it. For example, “ODBC Driver 17 for SQL Server” is a common one.


2. Configuration

In the Python script below, you’ll need to update the CONNECTION_DETAILS dictionary with your MSSQL server information.


3. The Python Script (mssql_table_counts_chart.py)

import pyodbc
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker # For formatting y-axis

# --- CONFIGURATION ---
# Replace with your actual MSSQL server details
# Option 1: SQL Server Authentication
SERVER_NAME = 'YOUR_SERVER_NAME_OR_IP'  # e.g., 'localhost\SQLEXPRESS' or '192.168.1.100'
DATABASE_NAME = 'YOUR_DATABASE_NAME'
USERNAME = 'YOUR_USERNAME'
PASSWORD = 'YOUR_PASSWORD'
DRIVER = '{ODBC Driver 17 for SQL Server}' # Or another appropriate driver like '{SQL Server}'

# Construct connection string for SQL Server Authentication
# conn_str = (
#     f'DRIVER={DRIVER};'
#     f'SERVER={SERVER_NAME};'
#     f'DATABASE={DATABASE_NAME};'
#     f'UID={USERNAME};'
#     f'PWD={PASSWORD};'
#     f'TrustServerCertificate=yes;' # Add if using self-signed cert or for some local instances
# )

# Option 2: Windows Authentication (Trusted Connection)
# If using Windows Authentication, UID and PWD are not needed.
# Ensure your current Windows user has access to the SQL Server database.
conn_str = (
    f'DRIVER={DRIVER};'
    f'SERVER={SERVER_NAME};'
    f'DATABASE={DATABASE_NAME};'
    f'Trusted_Connection=yes;'
)
# --- END CONFIGURATION ---

def get_table_row_counts(connection_string):
    """
    Connects to MSSQL, fetches all user table names, and their row counts.
    Returns a dictionary of { 'schema.table_name': row_count }.
    """
    table_counts = {}
    conn = None
    try:
        conn = pyodbc.connect(connection_string)
        cursor = conn.cursor()

        # Query to get all user table names (schema and name)
        # We filter out system tables by checking schema_id < 256 for user schemas
        # and type_desc = 'USER_TABLE'
        query_tables = """
            SELECT s.name AS schema_name, t.name AS table_name
            FROM sys.tables t
            INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
            WHERE t.type_desc = 'USER_TABLE' AND t.is_ms_shipped = 0
            ORDER BY s.name, t.name;
        """
        cursor.execute(query_tables)
        tables = cursor.fetchall()

        if not tables:
            print(f"No user tables found in database '{DATABASE_NAME}'.")
            return None

        print(f"Found {len(tables)} user tables. Fetching row counts...")

        for schema_name, table_name in tables:
            full_table_name_display = f"{schema_name}.{table_name}"
            # Properly quote identifiers for the count query
            full_table_name_query = f"[{schema_name}].[{table_name}]"
            
            query_count = f"SELECT COUNT_BIG(*) FROM {full_table_name_query};"
            try:
                cursor.execute(query_count)
                row_count = cursor.fetchone()[0]
                table_counts[full_table_name_display] = row_count
                print(f" - {full_table_name_display}: {row_count} rows")
            except pyodbc.Error as count_ex:
                print(f"Error counting rows for table {full_table_name_display}: {count_ex}")
                table_counts[full_table_name_display] = -1 # Indicate error

        return table_counts

    except pyodbc.Error as ex:
        sqlstate = ex.args[0]
        print(f"Database connection or query error: {sqlstate}")
        print(ex)
        return None
    finally:
        if conn:
            conn.close()

def create_bar_chart(table_counts_dict, db_name):
    """
    Creates and displays a bar chart of table row counts.
    """
    if not table_counts_dict:
        print("No data to plot.")
        return

    # Sort tables by name for consistent plotting (optional, could also sort by count)
    # sorted_items = sorted(table_counts_dict.items(), key=lambda item: item[0]) 
    # Or sort by count (descending)
    sorted_items = sorted(table_counts_dict.items(), key=lambda item: item[1], reverse=True)
    
    table_names = [item[0] for item in sorted_items]
    row_counts = [item[1] for item in sorted_items]

    plt.figure(figsize=(12, max(8, len(table_names) * 0.5))) # Adjust height based on number of tables

    bars = plt.bar(table_names, row_counts, color='skyblue')

    plt.xlabel("Table Name (Schema.Table)")
    plt.ylabel("Number of Rows")
    plt.title(f"Row Counts per Table in '{db_name}' Database")
    plt.xticks(rotation=45, ha="right", fontsize=9) # Rotate x-axis labels for better readability
    plt.yticks(fontsize=9)

    # Format y-axis to use commas for thousands
    ax = plt.gca()
    ax.yaxis.set_major_formatter(mticker.FuncFormatter(lambda x, p: format(int(x), ',')))

    # Add text labels on top of bars
    for bar in bars:
        yval = bar.get_height()
        if yval >= 0 : # Don't label error bars (-1)
            plt.text(bar.get_x() + bar.get_width()/2.0, yval + (0.01 * max(row_counts) if max(row_counts) > 0 else 1), 
                     f'{int(yval):,}', ha='center', va='bottom', fontsize=8)

    plt.grid(axis='y', linestyle='--', alpha=0.7)
    plt.tight_layout() # Adjust plot to ensure everything fits without overlapping
    
    # Save the chart to a file (optional)
    chart_filename = f"{db_name}_table_counts.png"
    plt.savefig(chart_filename)
    print(f"\nChart saved as {chart_filename}")

    plt.show() # Display the chart

if __name__ == "__main__":
    print("Attempting to connect to MSSQL and fetch table counts...")
    counts = get_table_row_counts(conn_str)

    if counts:
        print("\nGenerating bar chart...")
        create_bar_chart(counts, DATABASE_NAME)
    else:
        print("Failed to retrieve table counts or no tables found.")

4. Explanation

  1. Import Libraries:

    • pyodbc: Used for all database interactions with MSSQL.
    • matplotlib.pyplot as plt: The primary library for plotting.
    • matplotlib.ticker as mticker: Used for formatting the y-axis labels (e.g., adding commas for thousands).
  2. Configuration (SERVER_NAME, DATABASE_NAME, etc.):

    • You must fill in your MSSQL server details here.
    • DRIVER: Specifies the ODBC driver. '{ODBC Driver 17 for SQL Server}' is common. You might need to adjust this based on what’s installed on your system (e.g., '{SQL Server}' for older drivers).
    • conn_str: This is the connection string pyodbc uses. The script provides two examples:
      • SQL Server Authentication: Requires USERNAME and PASSWORD.
      • Windows Authentication (Trusted_Connection=yes;): Uses the credentials of the currently logged-in Windows user. Comment out the SQL Server auth block and uncomment this one if you prefer this method.
    • TrustServerCertificate=yes;: Added for convenience, especially with local instances or self-signed certificates. For production, you’d typically want proper certificate validation.
  3. get_table_row_counts(connection_string) function:

    • Takes the connection_string as input.
    • Establishes a connection to the MSSQL database using pyodbc.connect().
    • Creates a cursor object to execute SQL queries.
    • Fetches Table Names:
      • Executes a query against sys.tables and sys.schemas to get a list of all user-defined tables (schema name and table name). It filters out system tables.
    • Fetches Row Counts:
      • It iterates through the fetched table names.
      • For each table, it constructs and executes a SELECT COUNT_BIG(*) FROM [schema_name].[table_name]; query. COUNT_BIG is used for very large tables.
      • The schema and table names are enclosed in [] to handle names with spaces or special characters.
      • Stores the results in the table_counts dictionary, mapping schema.table_name to its row_count.
      • Includes error handling for both connection and individual count queries. If a count query fails for a specific table, it records -1 as the count.
    • Closes the database connection in a finally block to ensure it’s always closed.
    • Returns the table_counts dictionary.
  4. create_bar_chart(table_counts_dict, db_name) function:

    • Takes the table_counts_dict (from the previous function) and the db_name (for the title) as input.
    • Data Preparation:
      • Sorts the table items, by default by count in descending order. You can change this to sort by table name if preferred.
      • Separates table names and row counts into two lists, which matplotlib expects.
    • Plotting:
      • plt.figure(figsize=(...)): Creates a new figure and sets its size. The height is dynamically adjusted slightly based on the number of tables.
      • plt.bar(table_names, row_counts, ...): Creates the actual bar chart.
      • plt.xlabel(), plt.ylabel(), plt.title(): Set the labels and title for the chart.
      • plt.xticks(rotation=45, ha="right", ...): Rotates the x-axis labels (table names) by 45 degrees and aligns them to the right to prevent overlap if there are many tables.
      • ax.yaxis.set_major_formatter(...): Formats the y-axis numbers to include commas (e.g., “1,000” instead of “1000”).
      • Adding Text Labels on Bars: The loop iterates through the bars to add the exact count on top of each bar for better readability.
      • plt.grid(...): Adds a horizontal grid for easier value reading.
      • plt.tight_layout(): Automatically adjusts plot parameters to give a tight layout, often resolving label overlap issues.
      • plt.savefig(...): Saves the chart to an image file (e.g., YOUR_DATABASE_NAME_table_counts.png).
      • plt.show(): Displays the chart in a new window.
  5. if __name__ == "__main__": block:

    • This is the standard Python entry point for scripts.
    • It calls get_table_row_counts() to fetch the data.
    • If data is successfully retrieved, it calls create_bar_chart() to display it.
    • Includes basic print statements for progress.

How to Use “Update”:

The term “update bar chart” in this context means you re-run the script. Each time you run it, it will:

  1. Connect to the database.
  2. Query the current row counts for all tables.
  3. Generate a new bar chart based on this fresh data.

If you need a chart that updates live without re-running the script (e.g., in a dashboard), that would require a more complex setup, potentially involving a web framework (like Flask or Django) and JavaScript charting libraries, or a GUI application framework (like Tkinter, PyQt). This script provides a simple way to get a snapshot at any given time.