Update 'README.md'
parent e1cdb6fb17
commit 242c77fae3
README.md | 15 ++++++++++++---
@@ -2,16 +2,24 @@
 
 This is an experimental project and is not yet used as part of the OwlBoard stack.
 
+## Language
+
+It is so far undecided what language will be used. Documents for parsing are likely to be a few hundred lines long, so searching may become processor-intensive, meaning Go may be a good candidate; however, Python offers an array of libraries which could be helpful.
+
+## File formats
+
+Diagrams are received in DOCX format, but can easily be converted to ODT, DOC, or PDF, which provides flexibility in the languages and libraries used in the implementation.
+
 ## Aims
 
 The aim of diagram-parser is to simplify the addition of PIS codes that are not yet in the OwlBoard data source. The planned implementation is as follows:
 
 - diagram-parser is subscribed to an email inbox (IMAP/POP3)
-- Formatted train-crew schedule cards are sent to the inbox (DOCX - Maybe PDF alternatively - format) and loaded by diagram-parser
+- Formatted train-crew schedule cards are sent to the inbox and loaded by diagram-parser
 - List of existing PIS codes is loaded and a list of non-existent codes is compiled (0000-9999)
 - If a code is found both in the diagram and on the list of non-existent codes, a Gitea issue is opened providing details of the code.
 - Once the program has run and extracted only the relevant details, the email is deleted and the file is closed and not stored.
-- The eventual aim is to avoid any manual searching of the DOCX files.
+- The eventual aim is to avoid any manual searching of the files.
 
 The current process of adding new codes involves being made aware of them face to face, or finding them myself and manually finding and adding them to the data source.
 
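The code-matching steps listed under "Aims" could be sketched roughly as below. This is only an illustration in Python (one of the two candidate languages), not the project's implementation. The regular expression, the function names, and the issue title and body wording are all assumptions; the only details taken from the README are the 0000-9999 range and the idea of opening a Gitea issue for codes that are missing from the data source.

```python
import re

# Assumption: a PIS code is a standalone four-digit number (0000-9999).
# Real schedule cards may need a stricter, context-aware pattern.
PIS_CODE_PATTERN = re.compile(r"\b\d{4}\b")


def all_possible_codes():
    """Return every code from '0000' to '9999' as zero-padded strings."""
    return {f"{n:04d}" for n in range(10000)}


def missing_codes(existing_codes):
    """Return the codes that are not yet present in the data source."""
    return all_possible_codes() - set(existing_codes)


def codes_to_report(diagram_text, existing_codes):
    """Return codes found in the diagram text that the data source lacks."""
    found = set(PIS_CODE_PATTERN.findall(diagram_text))
    return sorted(found & missing_codes(existing_codes))


def build_issue_payload(code):
    """Build a hypothetical title/body pair for a Gitea issue about one code."""
    return {
        "title": f"Missing PIS code {code}",
        "body": f"PIS code {code} appears in a diagram but not in the data source.",
    }


if __name__ == "__main__":
    text = "Train 2C04 PIS 0123 to Bristol; return working PIS 4567"
    for code in codes_to_report(text, existing_codes={"4567"}):
        print(build_issue_payload(code)["title"])
```

Using sets keeps the comparison to a single intersection, which should remain cheap even for documents a few hundred lines long.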
@@ -23,4 +31,5 @@ The current process of adding new codes involves being made aware of them face t
 - The format of the attachment should be checked and any errors handled gracefully.
 
 ## Main external dependencies (Expected)
-- mailbox (https://pypi.org/project/mailbox/)
+- imaplib
+- email
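As a rough illustration of how the two standard-library modules named above might fit together, the sketch below fetches unread messages with `imaplib`, pulls out attachments with `email`, and checks whether an attachment looks like a DOCX file. The host, credentials, mailbox name, and all function names are placeholders rather than anything defined by the project. The DOCX check relies on the fact that a DOCX file is a ZIP archive containing `word/document.xml`, and it is only a best-effort test.

```python
import imaplib
import io
import zipfile
from email import message_from_bytes, policy


def fetch_unseen_messages(host, user, password):
    """Return raw RFC 822 bytes for each unseen message in INBOX (placeholder settings)."""
    raw_messages = []
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select("INBOX")
        _, data = conn.search(None, "UNSEEN")
        for num in data[0].split():
            _, msg_data = conn.fetch(num, "(RFC822)")
            raw_messages.append(msg_data[0][1])
    return raw_messages


def extract_attachments(raw_message):
    """Return (filename, payload_bytes) pairs for each attachment in a message."""
    message = message_from_bytes(raw_message, policy=policy.default)
    return [
        (part.get_filename(), part.get_payload(decode=True))
        for part in message.iter_attachments()
    ]


def looks_like_docx(data):
    """Return True if the bytes form a ZIP archive containing word/document.xml."""
    if not data:
        return False
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return "word/document.xml" in archive.namelist()
    except zipfile.BadZipFile:
        # Not a ZIP archive at all, so it cannot be a DOCX file.
        return False
```

Returning `False` for malformed input, rather than raising, is one way to meet the requirement that attachment format errors are handled gracefully; the caller can then skip or log the message.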