Polars MariaDB
Leer tabla de base de datos¶
Referencias¶
Documentaci贸n API Polars
Connector-x: Supported Sources & Destinations
pip install polars connectorx
import polars as pl
Lectura v铆a connectorx¶
Publicaci贸n VLDB 2022: ConnectorX: Accelerating Data Loading From Databases to Dataframes
El c贸digo siguiente permite leer los datos desde una tabla en la BD a una tabla Arrow¶
import connectorx as cx
import pyarrow.compute as pc
conn = 'mysql://usuario:contrase帽a@mariadb:3306/kafka'
query = 'SELECT * FROM songs'
# Leer produciendo una tabla arrow
ta = cx.read_sql(conn, query, return_type='arrow')
print("Tabla Arrow:", ta.schema)
pjta = ta.filter(pc.equal(ta['artist'], 'Pearl Jam'))
print(pjta.select(["album","title"]))
Tabla Arrow: ids: int64
title: string
album: string
duration: string
release: int64
artist: string
type: string
filename: string
fechreg: timestamp[us]
pyarrow.Table
album: string
title: string
----
album: [["Ten","Ten","Ten","Ten"]]
title: [["Once","Even Flow","Alive","Jeremy"]]
Leer datos de una tabla en un Polars Dataframe indicando el URI de conexi贸n¶
uri = "mysql://usuario:contrase帽a@mariadb:3306/kafka"
query = "SELECT * FROM songs"
pl.read_database_uri(query=query, uri=uri)
| ids | title | album | duration | release | artist | type | filename | fechreg |
|---|---|---|---|---|---|---|---|---|
| i64 | str | str | str | i64 | str | str | str | datetime[渭s] |
| 1 | "Black Hole Sun" | "Superunknown" | "05:06" | 1994 | "Soundgarden" | "Rock" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 2 | "Smells Like Teen Spirit" | "Nevermind" | "05:01" | 1991 | "Nirvana" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 3 | "Breed" | "Nevermind" | "03:03" | 1991 | "Nirvana" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 4 | "Lithium" | "Nevermind" | "04:17" | 1991 | "Nirvana" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 5 | "Once" | "Ten" | "03:51" | 1991 | "Pearl Jam" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 6 | "Even Flow" | "Ten" | "04:53" | 1991 | "Pearl Jam" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 7 | "Alive" | "Ten" | "05:40" | 1991 | "Pearl Jam" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 8 | "Jeremy" | "Ten" | "05:18" | 1991 | "Pearl Jam" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 9 | "Forty Six & 2" | "Aenima" | "06:04" | 1994 | "Tool" | "Metal Progresivo" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 10 | "Lateralus" | "Lateralus" | "09:24" | 2001 | "Tool" | "Metal Progresivo" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
Crear conexi贸n a la base de datos mediante Alchemy¶
import pandas as pd
from sqlalchemy import create_engine
# Define the connection string
username = 'usuario'
password = 'contrase帽a'
host = 'mariadb:3306'
database = 'kafka'
# Creating the engine
dbcon = create_engine(f'mariadb+mariadbconnector://{username}:{password}@{host}/{database}',pool_size=10, max_overflow=20)
Consultar la base de datos usando la conexi贸n dbcon¶
SQLAlchemy
polars.read_database() requiere una conexi贸n SQLALchemy. Por eso, normalmente es m谩s eficiente, y m谩s simple, usar polars.read_dabase_uri()
pl.read_database(query="SELECT * FROM songs", connection=dbcon)
| ids | title | album | duration | release | artist | type | filename | fechreg |
|---|---|---|---|---|---|---|---|---|
| i64 | str | str | str | i64 | str | str | str | datetime[渭s] |
| 1 | "Black Hole Sun" | "Superunknown" | "05:06" | 1994 | "Soundgarden" | "Rock" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 2 | "Smells Like Teen Spirit" | "Nevermind" | "05:01" | 1991 | "Nirvana" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 3 | "Breed" | "Nevermind" | "03:03" | 1991 | "Nirvana" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 4 | "Lithium" | "Nevermind" | "04:17" | 1991 | "Nirvana" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 5 | "Once" | "Ten" | "03:51" | 1991 | "Pearl Jam" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 6 | "Even Flow" | "Ten" | "04:53" | 1991 | "Pearl Jam" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 7 | "Alive" | "Ten" | "05:40" | 1991 | "Pearl Jam" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 8 | "Jeremy" | "Ten" | "05:18" | 1991 | "Pearl Jam" | "Grunge" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 9 | "Forty Six & 2" | "Aenima" | "06:04" | 1994 | "Tool" | "Metal Progresivo" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
| 10 | "Lateralus" | "Lateralus" | "09:24" | 2001 | "Tool" | "Metal Progresivo" | "metal-musics-dataset.csv" | 2025-04-09 07:59:48 |
Escribir en tabla de la base de datos¶
# Leer datos de archivo CSV local
dfpl = pl.read_csv("Eurostar/stations.csv")
# Volcar en la base de datos
dfpl.write_database(table_name='stations', connection=dbcon, engine='sqlalchemy', if_table_exists = "replace")