move parsing functions #13

DeltaDaniel · 2024-03-06T13:43:31Z

No description provided.

hf-krechan

I suspect we still need to adjust the mypy settings.
A suspicious number of variables there have no type hint.
But we can do that in a separate PR.

Otherwise this looks fine so far. I only have a lot of small remarks.

hf-krechan · 2024-03-12T10:13:11Z

dev_requirements/requirements-coverage.in

@@ -1,2 +1,3 @@
 # specific requirements for the tox coverage env
 coverage
+pytest_loguru


Oh nice, what is it and what does it do?

hf-krechan · 2024-03-12T10:22:23Z

src/migmose/parsing.py

+from maus.edifact import EdifactFormat
+
+
+def find_file_to_type(message_types: list[EdifactFormat], input_dir: Path) -> dict[EdifactFormat, Path]:


Instead of "type" I would rather stick with the domain wording, i.e. "format" or "EdifactFormat".
Please adjust that in the function name and in the parameter message_type.

hf-krechan · 2024-03-12T10:41:01Z

src/migmose/parsing.py

+    """
+    finds the file with the message type in the input directory
+    """
+    file_dict = {}


Doesn't mypy complain about the missing type hint here ^^?

I find that odd myself right now, but no 🤔
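For reference, a minimal sketch of what an explicit annotation could look like. As far as I know, mypy usually infers the type of an empty dict from the first assignment in the same scope, which is probably why it stays quiet here. `Format` is a stand-in enum so the snippet runs without maus, and both the helper name `map_formats_to_files` and the file-name matching are assumptions, not the code from this PR.

```python
from enum import Enum
from pathlib import Path


class Format(str, Enum):
    """Hypothetical stand-in for maus.edifact.EdifactFormat."""

    ORDCHG = "ORDCHG"
    ORDRSP = "ORDRSP"


def map_formats_to_files(formats: list[Format], file_names: list[str]) -> dict[Format, Path]:
    """Map each format to the first file name that starts with the format name."""
    # Explicit annotation: the empty literal alone carries no key/value types.
    file_dict: dict[Format, Path] = {}
    for message_format in formats:
        for name in file_names:
            # Assumption: the file name starts with the format name.
            if name.startswith(message_format.value):
                file_dict[message_format] = Path(name)
                break
    return file_dict


result = map_formats_to_files(
    [Format.ORDCHG, Format.ORDRSP],
    ["ORDCHG_MIG_1_1_info_20230331_v2.docx"],
)
```

With the annotation in place the intent is visible to the reader even if the function body later changes in a way that defeats the inference.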

hf-krechan · 2024-03-12T10:42:16Z

src/migmose/parsing.py

+            logger.warning(f"⚠️ No file found for {message_type}.", fg="red")
+    if file_dict:
+        return file_dict
+    logger.error("⚠️ No files found in the input directory.", fg="red")


I really like the emoticons. I would just use a different emoticon for errors, e.g. ❌

hf-krechan · 2024-03-12T10:44:05Z

src/migmose/parsing.py

+    raise click.Abort()
+
+
+def preliminary_output_as_json(table: list[str], message_type: EdifactFormat, output_dir: Path) -> None:


Suggested change

def preliminary_output_as_json(table: list[str], message_type: EdifactFormat, output_dir: Path) -> None:

def preliminary_output_as_json(table: list[str], message_format: EdifactFormat, output_dir: Path) -> None:

or

Suggested change

def preliminary_output_as_json(table: list[str], message_type: EdifactFormat, output_dir: Path) -> None:

def preliminary_output_as_json(table: list[str], edifact_format: EdifactFormat, output_dir: Path) -> None:

In German it is called Nachrichtenformat or EDIFACT-Format.

hf-krechan · 2024-03-12T10:51:44Z

src/migmose/parsing.py

+    if not output_dir.exists():
+        output_dir.mkdir(parents=True, exist_ok=True)


You can simplify this.

Suggested change

if not output_dir.exists():
    output_dir.mkdir(parents=True, exist_ok=True)

output_dir.mkdir(parents=True, exist_ok=True)

exist_ok helps you with exactly that.

will not raise an exception if the directory already exists (exist_ok=True)
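A small sketch of that behaviour, using a temporary directory so it can be run anywhere. The directory names are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    output_dir = Path(tmp) / "output" / "nested"

    # First call creates the directory and any missing parents.
    output_dir.mkdir(parents=True, exist_ok=True)
    # Second call is a no-op because exist_ok=True swallows the FileExistsError.
    output_dir.mkdir(parents=True, exist_ok=True)
    created = output_dir.is_dir()

    # Without exist_ok=True the same call fails on an existing directory.
    try:
        output_dir.mkdir(parents=True)
        raised = False
    except FileExistsError:
        raised = True
```

Note that exist_ok=True only covers an existing directory. If the path exists as a regular file, mkdir still raises FileExistsError.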

hf-krechan · 2024-03-12T10:55:16Z

src/migmose/parsing.py

+    raise click.Abort()
+
+
+def preliminary_output_as_json(table: list[str], message_type: EdifactFormat, output_dir: Path) -> None:


Why is the table only of type list[str]?
I would have thought it is one of our classes.

Here this really is just a simple output framework. The class-specific methods will come later. :-)

hf-krechan · 2024-03-12T10:56:04Z

src/migmose/parsing.py

+
+def parse_raw_nachrichtenstrukturzeile(input_path: Path) -> list[str]:
+    """
+    parses raw nachrichtenstrukturzeile from a table. returns list of raw lines


Suggested change

parses raw nachrichtenstrukturzeile from a table. returns list of raw lines

Parses raw nachrichtenstrukturzeile from a table. Returns list of raw lines.

hf-krechan · 2024-03-12T10:58:28Z

unittests/test_parsing.py

+    def test_find_file_to_type(self):
+        message_type = [EdifactFormat.ORDCHG]
+        input_dir = Path("unittests/test_data/")
+        file_dict = find_file_to_type(message_type, input_dir)
+        assert file_dict[EdifactFormat.ORDCHG] == input_dir / Path("ORDCHG_MIG_1_1_info_20230331_v2.docx")
+
+    def test_find_only_one_file(self, caplog):
+        message_type = [EdifactFormat.ORDCHG, EdifactFormat.ORDRSP]
+        input_dir = Path("unittests/test_data/")
+        with caplog.at_level(logging.WARNING):
+            file_dict = find_file_to_type(message_type, input_dir)
+            assert f"No file found for {EdifactFormat.ORDRSP}." in caplog.text
+            assert file_dict[EdifactFormat.ORDCHG] == input_dir / Path("ORDCHG_MIG_1_1_info_20230331_v2.docx")
+
+    def test_parse_raw_nachrichtenstrukturzeile(self):
+
+        input_file = Path("unittests/test_data/ORDCHG_MIG_1_1_info_20230331_v2.docx")
+        mig_table = parse_raw_nachrichtenstrukturzeile(input_file)
+        assert len(mig_table) == 18
+        assert "Nachrichten-Kopfsegment" in mig_table[0]
+        assert "Nachrichten-Endesegment" in mig_table[-1]
+
+    def test_preliminary_output_as_json(self, tmp_path):
+        table = ["line1", "line2", "line3"]
+        message_type = EdifactFormat.ORDCHG
+        output_dir = tmp_path / Path("output")
+
+        preliminary_output_as_json(table, message_type, output_dir)


Can you add a docstring to the tests so that one roughly knows what is being tested?
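A rough sketch of what such a docstring could look like. `write_lines_as_json` is a hypothetical stand-in for preliminary_output_as_json (the real function and its output file name may differ), so the snippet runs on its own; under pytest the `tmp_path` fixture would supply the directory.

```python
import json
import tempfile
from pathlib import Path


def write_lines_as_json(table: list[str], name: str, output_dir: Path) -> Path:
    """Hypothetical stand-in: dump the raw lines as a JSON list into output_dir."""
    output_dir.mkdir(parents=True, exist_ok=True)
    file_path = output_dir / f"{name}_preliminary_output.json"
    with open(file_path, "w", encoding="utf-8") as json_file:
        json.dump(table, json_file, indent=4)
    return file_path


def test_write_lines_as_json(tmp_path: Path) -> None:
    """The output directory is created and the lines round-trip through the JSON file."""
    table = ["line1", "line2", "line3"]
    file_path = write_lines_as_json(table, "ORDCHG", tmp_path / "output")
    assert file_path.exists()
    assert json.loads(file_path.read_text(encoding="utf-8")) == table


# Run outside pytest by supplying a temporary directory by hand.
with tempfile.TemporaryDirectory() as tmp:
    test_write_lines_as_json(Path(tmp))
```

One sentence stating the expected behaviour is usually enough; pytest shows the test name on failure, and the docstring tells the reader what the assertion is meant to guarantee.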

DeltaDaniel and others added 23 commits February 27, 2024 14:05

📍pin virtualenv version in pyproject.toml

93696df

due to issues with tox

init datamodell

8babf02

fix minor issue

d2dd314

spell_check

b633c2b

updated pre-commit hooks

b18cc4a

black

c35b4d9

add logger

a3b95f0

read line by line

7004055

Update pyproject.toml

b7d7eb4

Co-authored-by: kevin <68426071+hf-krechan@users.noreply.github.com>

refined line parser

848b511

Merge remote-tracking branch 'origin/main' into DDB/add_logger

0b446b7

Merge remote-tracking branch 'origin/main' into DDB/add_logger

fc63826

changed logger to loguru

f977e95

added examples to some docstrings

76cd270

Merge branch 'DDB/add_logger' into DDB/read_docx

09e83f0

removed unused code

f54f4ac

added CLI and input/output

a7cb35f

added json output

5178c4f

fixed linting, testing, etc, issues :-)

171c193

Merge remote-tracking branch 'origin/main' into DDB/add_FH_CLI

c8d75af

Update pyproject.toml

fb96d60

Co-authored-by: konstantin <konstantin.klein@hochfrequenz.de>

Update src/migmose/__main__.py

8e44a9d

Co-authored-by: konstantin <konstantin.klein@hochfrequenz.de>

message_type from maus.edifact.EdifactFormat

a97ca36

DeltaDaniel mentioned this pull request Mar 6, 2024

Add CLI #11

Merged

DeltaDaniel added 4 commits March 6, 2024 15:02

moved parsing functions, added test for find_file_to_type function

9042250

added tests for parsing module

406d29e

Merge branch 'main' into DDB/move_parsing_functions

0f461fe

black

5df6746

DeltaDaniel marked this pull request as ready for review March 6, 2024 18:20

hf-krechan approved these changes Mar 12, 2024

DeltaDaniel added 5 commits March 12, 2024 13:14

renamed NachrichtenTYPE -> NachrichtenFORMAT

109bcb4

refined documentation

4bbdc79

simplified preliminary_output_as_json function

62978f0

Merge remote-tracking branch 'origin/main' into DDB/move_parsing_func…

08d97ed

…tions

after merge requirements-compile

6e6c646

DeltaDaniel merged commit c76255a into main Mar 12, 2024
15 checks passed

DeltaDaniel deleted the DDB/move_parsing_functions branch March 12, 2024 13:27
