Parsing OneNote Tables in Python

Microsoft OneNote lets users embed structured tables directly in pages, which suits task lists, schedules, comparison matrices, and data collection forms. Aspose.Note FOSS for Python lets you extract all of this table data programmatically, without requiring a Microsoft Office installation.

Install

pip install aspose-note

Load the document and find tables

GetChildNodes(Table) performs a recursive search across the whole document and returns each table as a Table object:

from aspose.note import Document, Table

doc = Document("MyNotes.one")
tables = doc.GetChildNodes(Table)
print(f"Found {len(tables)} table(s)")

Read cell values

Tables follow a three-level hierarchy: Table → TableRow → TableCell. Each cell contains RichText nodes whose .Text property gives the plain-text content:

from aspose.note import Document, Table, TableRow, TableCell, RichText

doc = Document("MyNotes.one")

for t_num, table in enumerate(doc.GetChildNodes(Table), start=1):
    print(f"\nTable {t_num}:")
    for r_num, row in enumerate(table.GetChildNodes(TableRow), start=1):
        cells = row.GetChildNodes(TableCell)
        row_values = [
            " ".join(rt.Text for rt in cell.GetChildNodes(RichText)).strip()
            for cell in cells
        ]
        print(f"  Row {r_num}: {row_values}")
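The rows extracted this way are plain lists of strings, and nothing in the loop guarantees that every row has the same number of cells. The following is a minimal sketch, assuming the rows have already been collected into a list of lists; pad_rows is a hypothetical helper for illustration, not part of aspose-note, and the sample data is a stand-in.

```python
def pad_rows(rows, fill=""):
    """Pad every row with `fill` so that all rows have the same length."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [fill] * (width - len(row)) for row in rows]


# Stand-in data shaped like the row_values lists built in the loop above.
sample = [["Task", "Owner", "Due"], ["Write report", "Anna"], ["Review"]]
padded = pad_rows(sample)
for row in padded:
    print(row)
```

Padding to a rectangular shape makes later steps such as CSV export or header mapping easier to reason about, at the cost of hiding which cells were actually absent.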

Inspect column widths

Table.ColumnWidths returns the stored width of each column in points:

from aspose.note import Document, Table

doc = Document("MyNotes.one")
for i, table in enumerate(doc.GetChildNodes(Table), start=1):
    widths = list(table.ColumnWidths)  # stored column widths, in points
    print(f"Table {i}: {len(widths)} column(s)")
    print(f"  Widths (pts): {widths}")
    print(f"  Borders visible: {table.BordersVisible}")
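Widths in points are often easier to judge once converted to physical units. This is a small sketch, assuming the values are typographic points (72 per inch), as the text states; the helper names and the sample widths are illustrative and do not come from the library.

```python
POINTS_PER_INCH = 72   # typographic points per inch
CM_PER_INCH = 2.54


def points_to_inches(points):
    """Convert a width in points to inches."""
    return points / POINTS_PER_INCH


def points_to_cm(points):
    """Convert a width in points to centimetres."""
    return points_to_inches(points) * CM_PER_INCH


# Stand-in widths; real values would come from the table's column widths.
widths_pts = [72.0, 144.0, 36.0]
widths_cm = [round(points_to_cm(w), 2) for w in widths_pts]
print(widths_cm)
```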

Export all tables to CSV

Convert every table in the document to CSV:

import csv
import io
from aspose.note import Document, Table, TableRow, TableCell, RichText

doc = Document("MyNotes.one")
output = io.StringIO()
writer = csv.writer(output)

for table in doc.GetChildNodes(Table):
    for row in table.GetChildNodes(TableRow):
        values = [
            " ".join(rt.Text for rt in cell.GetChildNodes(RichText)).strip()
            for cell in row.GetChildNodes(TableCell)
        ]
        writer.writerow(values)
    writer.writerow([])   # blank row between tables

with open("tables.csv", "w", encoding="utf-8", newline="") as f:
    f.write(output.getvalue())

print("Saved tables.csv")
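Because the export separates tables with a blank row, the combined file can be split back into individual tables when it is read. The sketch below assumes that convention; split_tables is a hypothetical helper, and the sample tables are stand-ins written with csv.writer in the same way as the export.

```python
import csv
import io


def split_tables(csv_text):
    """Split CSV text into tables, treating each blank row as a separator."""
    tables, current = [], []
    for row in csv.reader(io.StringIO(csv_text, newline="")):
        if row:
            current.append(row)
        elif current:
            # A blank row ends the table collected so far.
            tables.append(current)
            current = []
    if current:
        tables.append(current)
    return tables


# Build stand-in CSV text: two tables, each followed by a blank row.
buffer = io.StringIO()
writer = csv.writer(buffer)
for table in ([["A", "B"], ["1", "2"]], [["X"], ["9"]]):
    writer.writerows(table)
    writer.writerow([])

tables = split_tables(buffer.getvalue())
print(tables)
```

A row whose cells are all empty strings is still written with delimiters or quotes, so only the explicit separator rows come back as empty lists.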

Export tables to a Python dict / JSON

import json
from aspose.note import Document, Table, TableRow, TableCell, RichText

doc = Document("MyNotes.one")
result = []

for table in doc.GetChildNodes(Table):
    rows = []
    for row in table.GetChildNodes(TableRow):
        cells = [
            " ".join(rt.Text for rt in cell.GetChildNodes(RichText)).strip()
            for cell in row.GetChildNodes(TableCell)
        ]
        rows.append(cells)
    result.append({"rows": rows, "column_widths": list(table.ColumnWidths)})

print(json.dumps(result, indent=2))
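Notes often contain non-ASCII text, which json.dumps escapes as \uXXXX sequences by default. A short sketch of keeping such text readable with ensure_ascii=False follows; the result list here is stand-in data shaped like the structure built above.

```python
import json

# Stand-in data with the same shape as the result list built above.
result = [
    {
        "rows": [["Uzdevums", "Termiņš"], ["Pārskats", "piektdiena"]],
        "column_widths": [120.0, 80.0],
    }
]

escaped = json.dumps(result, indent=2)                       # default: ASCII only
readable = json.dumps(result, indent=2, ensure_ascii=False)  # keeps ņ, š, ā as-is

print(readable)
```

Both strings decode to the same data. When writing the readable form to a file, open the file with encoding="utf-8".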

Use the first row as headers

from aspose.note import Document, Table, TableRow, TableCell, RichText

def row_text(row):
    # Join the text of every RichText node in each cell of the row.
    return [
        " ".join(rt.Text for rt in cell.GetChildNodes(RichText)).strip()
        for cell in row.GetChildNodes(TableCell)
    ]

doc = Document("MyNotes.one")

for table in doc.GetChildNodes(Table):
    rows = table.GetChildNodes(TableRow)
    if not rows:
        continue  # skip tables that have no rows

    headers = row_text(rows[0])
    print("Headers:", headers)
    for row in rows[1:]:
        # Pair each header with the cell in the same column.
        record = dict(zip(headers, row_text(row)))
        print("  Record:", record)
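Building records with dict(zip(...)) silently drops data when two headers share a name or a header is empty, and it omits keys for rows shorter than the header row. The following is a hedged sketch of one way to guard against that, working on plain lists of strings; unique_headers and make_records are hypothetical helpers, not library functions.

```python
from itertools import zip_longest


def unique_headers(headers):
    """Make header names unique and non-empty so no dict key is overwritten."""
    seen = {}
    result = []
    for index, name in enumerate(headers, start=1):
        base = name or f"column_{index}"
        seen[base] = seen.get(base, 0) + 1
        # The first occurrence keeps its name; repeats get a numeric suffix.
        result.append(base if seen[base] == 1 else f"{base}_{seen[base]}")
    return result


def make_records(headers, rows):
    """Map each row to a dict, padding short rows and dropping extra cells."""
    keys = unique_headers(headers)
    records = []
    for row in rows:
        cells = list(row[:len(keys)])
        records.append(dict(zip_longest(keys, cells, fillvalue="")))
    return records


# Stand-in data: a repeated header, an empty header, and a short row.
records = make_records(["Name", "Name", ""], [["a", "b", "c"], ["d"]])
print(records)
```

The suffix scheme is deliberately simple; a header that already looks like Name_2 could still collide with a generated one, so treat this as a starting point.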

What the library supports for tables

Feature                                        Supported
Table.ColumnWidths                             Yes: column widths in points
Table.BordersVisible                           Yes
Table.Tags                                     Yes: OneNote tags on tables
Cell text via RichText                         Yes
Cell images via Image
Merged cells (rowspan/colspan metadata)        Not exposed in the public API
Write/edit tables and save to .one

Next steps