Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move parsing functions #13

Merged
merged 32 commits into from
Mar 12, 2024
Merged

move parsing functions #13

merged 32 commits into from
Mar 12, 2024

Conversation

DeltaDaniel
Copy link
Contributor

No description provided.

@DeltaDaniel DeltaDaniel mentioned this pull request Mar 6, 2024
@DeltaDaniel DeltaDaniel marked this pull request as ready for review March 6, 2024 18:20
Copy link
Contributor

@hf-krechan hf-krechan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ich vermute dass wir noch die mypy settings anpassen müssen.
Da sind verdächtig viele variablen nicht mit einem type hint verstehen.
Können das aber auch in einem extra PR machen.

Sonst passt es soweit. Habe nur viele Kleinigkeiten.

@@ -1,2 +1,3 @@
# specific requirements for the tox coverage env
coverage
pytest_loguru
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh nice, was ist es, was tut es?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from maus.edifact import EdifactFormat


def find_file_to_type(message_types: list[EdifactFormat], input_dir: Path) -> dict[EdifactFormat, Path]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statt type würde ich lieber bei dem fachlichen wording bleiben. Also eher format oder EdifactFormat
Das dann in dem Funktionsnamen und dem parameter message_type anpassen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"""
finds the file with the message type in the input directory
"""
file_dict = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beschwert sich mypy hier nicht über den fehlenden typehint ^^?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finde ich gerade selber komisch, aber nein 🤔

logger.warning(f"⚠️ No file found for {message_type}.", fg="red")
if file_dict:
return file_dict
logger.error("⚠️ No files found in the input directory.", fg="red")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finde die emoticons mega gut. Würde nur für errors ein anders emoticon nehmen, bspw. ❌

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raise click.Abort()


def preliminary_output_as_json(table: list[str], message_type: EdifactFormat, output_dir: Path) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def preliminary_output_as_json(table: list[str], message_type: EdifactFormat, output_dir: Path) -> None:
def preliminary_output_as_json(table: list[str], message_format: EdifactFormat, output_dir: Path) -> None:

oder

Suggested change
def preliminary_output_as_json(table: list[str], message_type: EdifactFormat, output_dir: Path) -> None:
def preliminary_output_as_json(table: list[str], edifact_format: EdifactFormat, output_dir: Path) -> None:

im Deutschen heißt es Nachrichtenformat oder EDIFACT-Format.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 44 to 45
if not output_dir.exists():
output_dir.mkdir(parents=True, exist_ok=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

das kannst du vereinfachen

Suggested change
if not output_dir.exists():
output_dir.mkdir(parents=True, exist_ok=True)
output_dir.mkdir(parents=True, exist_ok=True)

das exist_ok hilft dir genau dabei.

will not raise an exception if the directory already exists (exist_ok=True)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raise click.Abort()


def preliminary_output_as_json(table: list[str], message_type: EdifactFormat, output_dir: Path) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wieso ist die table nur vom typ list[str]?
Hätte gedacht dass es eins von unseren Klassen ist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hier ist das wirklich nur ein einfaches Outputframework. Die klassenspezifischen Methoden kommen später. :-)


def parse_raw_nachrichtenstrukturzeile(input_path: Path) -> list[str]:
"""
parses raw nachrichtenstrukturzeile from a table. returns list of raw lines
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
parses raw nachrichtenstrukturzeile from a table. returns list of raw lines
Parses raw nachrichtenstrukturzeile from a table. Returns list of raw lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 20 to 47
def test_find_file_to_type(self):
message_type = [EdifactFormat.ORDCHG]
input_dir = Path("unittests/test_data/")
file_dict = find_file_to_type(message_type, input_dir)
assert file_dict[EdifactFormat.ORDCHG] == input_dir / Path("ORDCHG_MIG_1_1_info_20230331_v2.docx")

def test_find_only_one_file(self, caplog):
message_type = [EdifactFormat.ORDCHG, EdifactFormat.ORDRSP]
input_dir = Path("unittests/test_data/")
with caplog.at_level(logging.WARNING):
file_dict = find_file_to_type(message_type, input_dir)
assert f"No file found for {EdifactFormat.ORDRSP}." in caplog.text
assert file_dict[EdifactFormat.ORDCHG] == input_dir / Path("ORDCHG_MIG_1_1_info_20230331_v2.docx")

def test_parse_raw_nachrichtenstrukturzeile(self):

input_file = Path("unittests/test_data/ORDCHG_MIG_1_1_info_20230331_v2.docx")
mig_table = parse_raw_nachrichtenstrukturzeile(input_file)
assert len(mig_table) == 18
assert "Nachrichten-Kopfsegment" in mig_table[0]
assert "Nachrichten-Endesegment" in mig_table[-1]

def test_preliminary_output_as_json(self, tmp_path):
table = ["line1", "line2", "line3"]
message_type = EdifactFormat.ORDCHG
output_dir = tmp_path / Path("output")

preliminary_output_as_json(table, message_type, output_dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kannst du den tests noch einen docstring verpassen damit man grob weiß was getestet wird?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DeltaDaniel DeltaDaniel merged commit c76255a into main Mar 12, 2024
15 checks passed
@DeltaDaniel DeltaDaniel deleted the DDB/move_parsing_functions branch March 12, 2024 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants