python 查询 MSSQL 数据库表数量更新 Bar chart 英文版
Okay, “honeymoose”! Here’s a Python script that connects to an MSSQL database, queries the row counts for each user table, and then displays this information in a bar chart using matplotlib
.
This English version will include:
- Prerequisites: What you need to install.
- Configuration: How to set up your database connection.
- The Python Script: Code to perform the query and generate the chart.
- Explanation: A breakdown of what the script does.
1. Prerequisites
You’ll need to install a few Python libraries:
pyodbc
: For connecting to SQL Server.matplotlib
: For creating the bar chart.
You can install them using pip:
pip install pyodbc matplotlib
You also need an ODBC driver for SQL Server installed on your system. If you’re on Windows, it’s often pre-installed. If not, or if you’re on Linux/macOS, you might need to install it. For example, “ODBC Driver 17 for SQL Server” is a common one.
2. Configuration
In the Python script below, you’ll need to update the CONNECTION_DETAILS
dictionary with your MSSQL server information.
3. The Python Script (mssql_table_counts_chart.py
)
import pyodbc
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker # For formatting y-axis
# --- CONFIGURATION ---
# Replace with your actual MSSQL server details
# Option 1: SQL Server Authentication
SERVER_NAME = 'YOUR_SERVER_NAME_OR_IP' # e.g., 'localhost\SQLEXPRESS' or '192.168.1.100'
DATABASE_NAME = 'YOUR_DATABASE_NAME'
USERNAME = 'YOUR_USERNAME'
PASSWORD = 'YOUR_PASSWORD'
DRIVER = '{ODBC Driver 17 for SQL Server}' # Or another appropriate driver like '{SQL Server}'
# Construct connection string for SQL Server Authentication
# conn_str = (
# f'DRIVER={DRIVER};'
# f'SERVER={SERVER_NAME};'
# f'DATABASE={DATABASE_NAME};'
# f'UID={USERNAME};'
# f'PWD={PASSWORD};'
# f'TrustServerCertificate=yes;' # Add if using self-signed cert or for some local instances
# )
# Option 2: Windows Authentication (Trusted Connection)
# If using Windows Authentication, UID and PWD are not needed.
# Ensure your current Windows user has access to the SQL Server database.
conn_str = (
f'DRIVER={DRIVER};'
f'SERVER={SERVER_NAME};'
f'DATABASE={DATABASE_NAME};'
f'Trusted_Connection=yes;'
)
# --- END CONFIGURATION ---
def get_table_row_counts(connection_string):
"""
Connects to MSSQL, fetches all user table names, and their row counts.
Returns a dictionary of { 'schema.table_name': row_count }.
"""
table_counts = {}
conn = None
try:
conn = pyodbc.connect(connection_string)
cursor = conn.cursor()
# Query to get all user table names (schema and name)
# We filter out system tables by checking schema_id < 256 for user schemas
# and type_desc = 'USER_TABLE'
query_tables = """
SELECT s.name AS schema_name, t.name AS table_name
FROM sys.tables t
INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE t.type_desc = 'USER_TABLE' AND t.is_ms_shipped = 0
ORDER BY s.name, t.name;
"""
cursor.execute(query_tables)
tables = cursor.fetchall()
if not tables:
print(f"No user tables found in database '{DATABASE_NAME}'.")
return None
print(f"Found {len(tables)} user tables. Fetching row counts...")
for schema_name, table_name in tables:
full_table_name_display = f"{schema_name}.{table_name}"
# Properly quote identifiers for the count query
full_table_name_query = f"[{schema_name}].[{table_name}]"
query_count = f"SELECT COUNT_BIG(*) FROM {full_table_name_query};"
try:
cursor.execute(query_count)
row_count = cursor.fetchone()[0]
table_counts[full_table_name_display] = row_count
print(f" - {full_table_name_display}: {row_count} rows")
except pyodbc.Error as count_ex:
print(f"Error counting rows for table {full_table_name_display}: {count_ex}")
table_counts[full_table_name_display] = -1 # Indicate error
return table_counts
except pyodbc.Error as ex:
sqlstate = ex.args[0]
print(f"Database connection or query error: {sqlstate}")
print(ex)
return None
finally:
if conn:
conn.close()
def create_bar_chart(table_counts_dict, db_name):
"""
Creates and displays a bar chart of table row counts.
"""
if not table_counts_dict:
print("No data to plot.")
return
# Sort tables by name for consistent plotting (optional, could also sort by count)
# sorted_items = sorted(table_counts_dict.items(), key=lambda item: item[0])
# Or sort by count (descending)
sorted_items = sorted(table_counts_dict.items(), key=lambda item: item[1], reverse=True)
table_names = [item[0] for item in sorted_items]
row_counts = [item[1] for item in sorted_items]
plt.figure(figsize=(12, max(8, len(table_names) * 0.5))) # Adjust height based on number of tables
bars = plt.bar(table_names, row_counts, color='skyblue')
plt.xlabel("Table Name (Schema.Table)")
plt.ylabel("Number of Rows")
plt.title(f"Row Counts per Table in '{db_name}' Database")
plt.xticks(rotation=45, ha="right", fontsize=9) # Rotate x-axis labels for better readability
plt.yticks(fontsize=9)
# Format y-axis to use commas for thousands
ax = plt.gca()
ax.yaxis.set_major_formatter(mticker.FuncFormatter(lambda x, p: format(int(x), ',')))
# Add text labels on top of bars
for bar in bars:
yval = bar.get_height()
if yval >= 0 : # Don't label error bars (-1)
plt.text(bar.get_x() + bar.get_width()/2.0, yval + (0.01 * max(row_counts) if max(row_counts) > 0 else 1),
f'{int(yval):,}', ha='center', va='bottom', fontsize=8)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.tight_layout() # Adjust plot to ensure everything fits without overlapping
# Save the chart to a file (optional)
chart_filename = f"{db_name}_table_counts.png"
plt.savefig(chart_filename)
print(f"\nChart saved as {chart_filename}")
plt.show() # Display the chart
if __name__ == "__main__":
print("Attempting to connect to MSSQL and fetch table counts...")
counts = get_table_row_counts(conn_str)
if counts:
print("\nGenerating bar chart...")
create_bar_chart(counts, DATABASE_NAME)
else:
print("Failed to retrieve table counts or no tables found.")
4. Explanation
-
Import Libraries:
pyodbc
: Used for all database interactions with MSSQL.matplotlib.pyplot as plt
: The primary library for plotting.matplotlib.ticker as mticker
: Used for formatting the y-axis labels (e.g., adding commas for thousands).
-
Configuration (
SERVER_NAME
,DATABASE_NAME
, etc.):- You must fill in your MSSQL server details here.
DRIVER
: Specifies the ODBC driver.'{ODBC Driver 17 for SQL Server}'
is common. You might need to adjust this based on what’s installed on your system (e.g.,'{SQL Server}'
for older drivers).conn_str
: This is the connection stringpyodbc
uses. The script provides two examples:- SQL Server Authentication: Requires
USERNAME
andPASSWORD
. - Windows Authentication (
Trusted_Connection=yes;
): Uses the credentials of the currently logged-in Windows user. Comment out the SQL Server auth block and uncomment this one if you prefer this method.
- SQL Server Authentication: Requires
TrustServerCertificate=yes;
: Added for convenience, especially with local instances or self-signed certificates. For production, you’d typically want proper certificate validation.
-
get_table_row_counts(connection_string)
function:- Takes the
connection_string
as input. - Establishes a connection to the MSSQL database using
pyodbc.connect()
. - Creates a
cursor
object to execute SQL queries. - Fetches Table Names:
- Executes a query against
sys.tables
andsys.schemas
to get a list of all user-defined tables (schema name and table name). It filters out system tables.
- Executes a query against
- Fetches Row Counts:
- It iterates through the fetched table names.
- For each table, it constructs and executes a
SELECT COUNT_BIG(*) FROM [schema_name].[table_name];
query.COUNT_BIG
is used for very large tables. - The schema and table names are enclosed in
[]
to handle names with spaces or special characters. - Stores the results in the
table_counts
dictionary, mappingschema.table_name
to itsrow_count
. - Includes error handling for both connection and individual count queries. If a count query fails for a specific table, it records
-1
as the count.
- Closes the database connection in a
finally
block to ensure it’s always closed. - Returns the
table_counts
dictionary.
- Takes the
-
create_bar_chart(table_counts_dict, db_name)
function:- Takes the
table_counts_dict
(from the previous function) and thedb_name
(for the title) as input. - Data Preparation:
- Sorts the table items, by default by count in descending order. You can change this to sort by table name if preferred.
- Separates table names and row counts into two lists, which
matplotlib
expects.
- Plotting:
plt.figure(figsize=(...))
: Creates a new figure and sets its size. The height is dynamically adjusted slightly based on the number of tables.plt.bar(table_names, row_counts, ...)
: Creates the actual bar chart.plt.xlabel()
,plt.ylabel()
,plt.title()
: Set the labels and title for the chart.plt.xticks(rotation=45, ha="right", ...)
: Rotates the x-axis labels (table names) by 45 degrees and aligns them to the right to prevent overlap if there are many tables.ax.yaxis.set_major_formatter(...)
: Formats the y-axis numbers to include commas (e.g., “1,000” instead of “1000”).- Adding Text Labels on Bars: The loop iterates through the
bars
to add the exact count on top of each bar for better readability. plt.grid(...)
: Adds a horizontal grid for easier value reading.plt.tight_layout()
: Automatically adjusts plot parameters to give a tight layout, often resolving label overlap issues.plt.savefig(...)
: Saves the chart to an image file (e.g.,YOUR_DATABASE_NAME_table_counts.png
).plt.show()
: Displays the chart in a new window.
- Takes the
-
if __name__ == "__main__":
block:- This is the standard Python entry point for scripts.
- It calls
get_table_row_counts()
to fetch the data. - If data is successfully retrieved, it calls
create_bar_chart()
to display it. - Includes basic print statements for progress.
How to Use “Update”:
The term “update bar chart” in this context means you re-run the script. Each time you run it, it will:
- Connect to the database.
- Query the current row counts for all tables.
- Generate a new bar chart based on this fresh data.
If you need a chart that updates live without re-running the script (e.g., in a dashboard), that would require a more complex setup, potentially involving a web framework (like Flask or Django) and JavaScript charting libraries, or a GUI application framework (like Tkinter, PyQt). This script provides a simple way to get a snapshot at any given time.