First, the software architecture. A question bank exists for people to read, and they read it either by opening a browser or by installing client software. This is the front-end display layer, which corresponds to web front-end development or client development respectively.
The advantage of a web front end is that it is cross-platform: all the user needs is a browser. The disadvantages are just as clear: it needs browser compatibility work and anti-crawler protection, and its security is weaker.
The advantage of a client is that it can make full use of the operating system and will perform better; for example, it can use the operating system for caching. The drawback is equally clear: it is not cross-platform, so each platform has to be developed separately.
Which front-end approach to choose depends on the target user group.
With the front-end display settled, next comes the back end. At its simplest, the back end only handles storage and search, which needs a database and a public cloud, for example to store images and questions. If you want to assemble test papers from the bank, you may need a dedicated server for that.
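As a rough illustration of the "simple storage and search" role of the back end, here is a minimal sketch using Python's built-in sqlite3 module. The table name, the columns, and the helper functions open_bank, add_question, and search are assumptions made up for this example; a production system would more likely use a server database and a full-text index.

```python
import sqlite3

def open_bank(path=":memory:"):
    """Open the database and create the (hypothetical) questions table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS questions (
               id INTEGER PRIMARY KEY,
               subject TEXT NOT NULL,
               html TEXT NOT NULL
           )"""
    )
    return conn

def add_question(conn, subject, html):
    """Store one question as an HTML fragment and return its row id."""
    cur = conn.execute(
        "INSERT INTO questions (subject, html) VALUES (?, ?)", (subject, html)
    )
    conn.commit()
    return cur.lastrowid

def search(conn, keyword):
    """Return (id, subject, html) rows whose HTML contains the keyword."""
    # Plain substring match; fine for a sketch, slow on a large bank.
    rows = conn.execute(
        "SELECT id, subject, html FROM questions WHERE html LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return rows.fetchall()
```

Images would normally go to object storage on the public cloud, with only their URLs kept in the HTML stored here.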
If there are many users, concurrency has to be considered, and architectures such as clustering and microservices can be adopted to handle the load.
Second, with the software architecture settled, the next problem is building the question content.
This breaks down into the following parts:
Where do the questions come from?
How is the question bank parsed into something the front end can display?
One source is to buy questions in Word format from others; a single purchase may contain hundreds of pages of questions. The other is to hire college students or teachers to research and write original questions. The latter is time-consuming and inefficient, so buying is generally preferred.
Purchased Word files may carry someone else's watermark and some hidden marks. Math test questions will also contain many formulas (the MathType format is popular and supported by many tools).
So the first step is cleaning: removing the marks related to intellectual property (watermarks, hidden marks, and so on).
After cleaning you have a clean Word document of test questions. Everything is still in Word format, but question bank software generally works with HTML, so the second step is to split the document into individual questions, convert each one to HTML, and save it.
Splitting and conversion involve handling formulas, images, tables, and so on.
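The split-and-convert step can be sketched on plain paragraph text. This example assumes the paragraphs have already been extracted from the Word file as strings, and that every question starts with a number followed by a period; both are assumptions for illustration, and the function names split_questions and to_html are hypothetical. Formulas, images, and tables are not handled here.

```python
import html
import re

# Assumption: a question starts with a number followed by "." (or "、" / "．").
QUESTION_START = re.compile(r"^\s*(\d+)\s*[.、．]\s*")

def split_questions(paragraphs):
    """Group paragraph strings into questions, one list of paragraphs each."""
    questions = []
    for text in paragraphs:
        if not text.strip():
            continue  # skip blank paragraphs
        match = QUESTION_START.match(text)
        if match:
            # A new question begins; drop the leading number.
            questions.append([text[match.end():]])
        elif questions:
            # Continuation (options, sub-parts) of the current question.
            questions[-1].append(text)
        # Paragraphs before the first numbered question (titles) are ignored.
    return questions

def to_html(question):
    """Render one question as an HTML fragment, escaping special characters."""
    body = "".join(f"<p>{html.escape(p.strip())}</p>" for p in question)
    return f'<div class="question">{body}</div>'
```

A numbering rule this simple will misfire on paragraphs that merely begin with a number such as "3.5 metres", which is one of the details that needs testing against real documents.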
During parsing we should also be able to extract some attributes of each question, such as region, year, difficulty, and the knowledge point it belongs to.
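How these attributes are recovered depends entirely on how the purchased documents label them. As a sketch only, the example below assumes a made-up convention: a source tag like "(2019 Beijing)" in the question text, plus lines beginning with "Difficulty:" and "Knowledge point:". The QuestionMeta class and extract_meta function are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class QuestionMeta:
    year: Optional[int] = None
    region: Optional[str] = None
    difficulty: Optional[str] = None
    knowledge_point: Optional[str] = None

# Assumed tag formats; real documents will differ and need their own patterns.
SOURCE_TAG = re.compile(r"\((\d{4})\s+([^)]+)\)")
FIELD = re.compile(r"^(Difficulty|Knowledge point)\s*:\s*(.+)$", re.MULTILINE)

def extract_meta(text):
    """Pull year, region, difficulty and knowledge point out of question text."""
    meta = QuestionMeta()
    source = SOURCE_TAG.search(text)
    if source:
        meta.year = int(source.group(1))
        meta.region = source.group(2).strip()
    for name, value in FIELD.findall(text):
        if name == "Difficulty":
            meta.difficulty = value.strip()
        else:
            meta.knowledge_point = value.strip()
    return meta
```

Fields that cannot be found stay as None, so they can be filled in later by hand or by a separate tagging pass.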
Because there are so many questions, the processing is generally done in batches: feed in a large document and the program runs by itself in the background.
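A batch runner along these lines might look like the sketch below. It uses a thread pool from Python's standard library; process_document is a placeholder that only counts non-empty lines in a text file, standing in for the real Word parsing step, and the "*.txt" pattern is likewise an assumption for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def process_document(path):
    """Placeholder for the real parse step: count non-empty lines."""
    text = Path(path).read_text(encoding="utf-8")
    return sum(1 for line in text.splitlines() if line.strip())

def process_batch(folder, pattern="*.txt", workers=4):
    """Run process_document on every matching file; collect results and errors."""
    results, errors = {}, {}
    paths = sorted(Path(folder).glob(pattern))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {p.name: pool.submit(process_document, p) for p in paths}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # One bad file must not stop the rest of the batch.
                errors[name] = str(exc)
    return results, errors
```

Recording failures per file instead of aborting means the problem documents can be fixed and rerun on their own.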
Because the Word specification is complex, parsing involves many details, and a satisfactory result only comes through repeated testing and revision.