How should XPATH be written?
1. Selecting nodes

Expression    Description

nodename    Selects all child elements of the current node that are named nodename.

/    Selects from the root node.

//    Selects matching nodes anywhere below the current node, regardless of their position in the document.

.    Selects the current node.

..    Selects the parent of the current node.

@    Selects an attribute.

Path expression    Result

bookstore    Selects all bookstore elements that are children of the current node.

/bookstore    Selects the root element bookstore. Note: a path that starts with a forward slash (/) is always an absolute path to an element.

bookstore/book    Selects all book elements that are children of bookstore.

//book    Selects all book elements, regardless of their position in the document.

bookstore//book    Selects all book elements that are descendants of the bookstore element, however deeply they are nested inside it.

//@lang    Selects all attributes named lang.
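The path expressions above can be tried out with a minimal sketch like the following. It assumes Python's standard-library xml.etree.ElementTree, which implements only a subset of XPath (relative paths must start with "." and attribute nodes cannot be selected directly), and the sample document is hypothetical rather than taken from this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML = """
<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">Learning XML</title><price>39.95</price></book>
</bookstore>
"""

root = ET.fromstring(XML)  # root is the <bookstore> element

# "book": child elements of the current node that are named book
children = root.findall("book")

# ".//book": book elements anywhere below the current node
descendants = root.findall(".//book")

# "./book/title": two single-slash steps, children only at each level
titles = [t.text for t in root.findall("./book/title")]

# ".//title/..": the parent of every title element
parents = root.findall(".//title/..")

# ElementTree cannot return attribute nodes (//@lang), so select the
# elements that carry the attribute and read its value from each one.
langs = [t.get("lang") for t in root.findall(".//title[@lang]")]

print(len(children), len(descendants), titles, [p.tag for p in parents], langs)
```

A full XPath 1.0 engine would accept //book and //@lang as written; the leading "." and the attribute workaround are specific to ElementTree.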

Examples

1. Find the root element of the page: /* (a bare / selects the document root node itself).

2. Find all input elements on the page: //input

3. Find the input elements that are direct children of the first form element on the page (only the level immediately below form, expressed with a single /): //form[1]/input

4. Find all input elements inside the first form element on the page (any input inside the form counts, no matter how many other tags it is nested in, expressed with a double //): //form[1]//input

5. Find the first form element on the page: //form[1] (strictly, this matches every form that is the first form child of its parent; (//form)[1] selects only the first form in the whole document).

6. Find the form element whose id is loginForm: //form[@id='loginForm']

7. Find the input element whose name attribute is username: //input[@name='username']

8. Find the first input element under the form element whose id is loginForm: //form[@id='loginForm']/input[1]

9. Find the input element whose name attribute is continue and whose type attribute is button: //input[@name='continue'][@type='button']

10. Find all elements on the page that have an id attribute: //*[@id] (to select the id attributes themselves, use //@id).
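The form lookups in this list can be checked with a short sketch. It assumes Python's standard-library xml.etree.ElementTree (a subset of XPath, with relative paths starting at ".") and a hypothetical, well-formed page fragment; real HTML is often not well-formed XML and would need an HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment, written as well-formed XML for illustration.
PAGE = """
<html><body>
  <form id="loginForm">
    <input name="username" type="text"/>
    <div><input name="password" type="password"/></div>
    <input name="continue" type="button"/>
  </form>
  <form id="searchForm"><input name="q" type="text"/></form>
</body></html>
"""

root = ET.fromstring(PAGE)

all_inputs = root.findall(".//input")        # every input on the page
first_form = root.find(".//form[1]")         # first form child of its parent
direct = root.findall(".//form[1]/input")    # direct children only
nested = root.findall(".//form[1]//input")   # any depth inside the form
login = root.find(".//form[@id='loginForm']")
first_input = root.find(".//form[@id='loginForm']/input[1]")
button = root.findall(".//input[@name='continue'][@type='button']")
with_id = root.findall(".//*[@id]")          # elements that have an id

print(len(all_inputs), len(direct), len(nested))
print(first_form.get("id"), first_input.get("name"), len(button))
print([e.get("id") for e in with_id])
```

The difference between items 3 and 4 shows up in the counts: the password input sits inside a div, so only the double-slash query reaches it.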

2. Predicates (filtering the nodes a step selects)

Examples

Path expression    Result

/bookstore/book[1]    Selects the first book element that is a child of bookstore.

/bookstore/book[last()]    Selects the last book element that is a child of bookstore.

/bookstore/book[last()-1]    Selects the second-to-last book element that is a child of bookstore.

/bookstore/book[position()<3]    Selects the first two book elements that are children of bookstore.

//title[@lang]    Selects all title elements that have an attribute named lang.

//title[@lang='eng']    Selects all title elements whose lang attribute has the value eng.

/bookstore/book[price>35.00]    Selects all book elements of bookstore whose price child element has a value greater than 35.00.

/bookstore/book[price>35.00]/title    Selects the title elements of all book elements of bookstore whose price child element has a value greater than 35.00.
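A minimal sketch of these predicates, assuming Python's standard-library xml.etree.ElementTree and a hypothetical sample document. ElementTree understands positional and attribute predicates but not position()<3 or numeric comparisons such as price>35.00, so those two are emulated in plain Python here; a full XPath engine would evaluate them directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML = """
<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">Learning XML</title><price>39.95</price></book>
  <book><title>Untagged</title><price>45.00</price></book>
</bookstore>
"""

root = ET.fromstring(XML)

first = root.find("./book[1]/title").text
last = root.find("./book[last()]/title").text
second_last = root.find("./book[last()-1]/title").text

with_lang = [t.text for t in root.findall(".//title[@lang]")]
eng = [t.text for t in root.findall(".//title[@lang='eng']")]

# Emulation of book[position()<3]: slice the list of matches.
first_two = [b.find("title").text for b in root.findall("./book")[:2]]

# Emulation of book[price>35.00]/title: compare the price in Python.
expensive = [
    b.find("title").text
    for b in root.findall("./book")
    if float(b.find("price").text) > 35.00
]

print(first, last, second_last)
print(with_lang, eng, first_two, expensive)
```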

3. Selecting unknown nodes

Wildcard    Description

*    Matches any element node.

@*    Matches any attribute node.

node()    Matches a node of any type.

Examples

Path expression    Result

/bookstore/*    Selects all child elements of the bookstore element.

//*    Selects all elements in the document.

//title[@*]    Selects all title elements that have at least one attribute.
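The wildcard examples can be sketched as follows, assuming Python's standard-library xml.etree.ElementTree and a hypothetical sample document. ElementTree supports the * wildcard but not the [@*] predicate, so the last query is emulated by checking each element's attribute dictionary.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML = """
<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title>Untagged</title><price>45.00</price></book>
</bookstore>
"""

root = ET.fromstring(XML)

# "/bookstore/*": every child element of bookstore
children = [e.tag for e in root.findall("./*")]

# "//*": every element in the document. root.iter() includes the root
# element itself, whereas root.findall(".//*") returns only descendants.
everything = [e.tag for e in root.iter()]

# Emulation of //title[@*]: titles whose attribute dictionary is non-empty.
titles_with_attrs = [t.text for t in root.iter("title") if t.attrib]

print(children, len(everything), titles_with_attrs)
```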

4. Selecting several paths

You can select several paths at once by joining path expressions with the | operator.

Path expression    Result

//book/title | //book/price    Selects all title elements and all price elements of book elements.

//title | //price    Selects all title and price elements in the document.

/bookstore/book/title | //price    Selects all title elements of book elements that belong to the bookstore element, plus all price elements in the document.
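Python's standard-library xml.etree.ElementTree does not implement the | operator, so the sketch below only emulates a union: it runs each path separately, then returns the combined matches once each, in document order, which is how an XPath union behaves. The helper name union and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML = """
<bookstore>
  <book><title>Harry Potter</title><price>29.99</price></book>
  <book><title>Learning XML</title><price>39.95</price></book>
</bookstore>
"""


def union(root, *paths):
    """Emulate path1 | path2 | ...: matches of any path, in document order."""
    wanted = set()
    for path in paths:
        wanted.update(root.findall(path))
    # Walking the tree restores document order and removes duplicates.
    return [elem for elem in root.iter() if elem in wanted]


root = ET.fromstring(XML)

# Emulation of //book/title | //book/price
both = union(root, ".//book/title", ".//book/price")
print([e.tag for e in both])

# The same element matched by two paths appears only once.
once = union(root, ".//title", ".//book/title")
print(len(once))
```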

5. Keywords

Keyword    Example

text()    book/author/text()

string()    book/author/string()

data()    book/author/data()

.    book/author/.

Example

XML sample

<book><author>Tom <em>John</em> cat</author><pricing><price>20</price><discount>0.8</discount></pricing></book>

text()

You will often see text() at the end of an XPath expression. It returns only the text nodes that are direct children of the specified element.

With the XPath book/author/text(), the extracted content is Tom cat. John is not included because it sits inside the em element and is not direct text content of author.

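string()

The string() function returns all text content of the specified element, including the text of its descendants, concatenated into a single string.

With the XPath book/author/string(), the extracted content is Tom John cat, that is, everything from the start of the element to its end. Note that using string() as a trailing step is XPath 2.0 syntax; in XPath 1.0 the equivalent is string(book/author).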
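The difference between text() and string() can be reproduced with a small sketch. It assumes Python's standard-library xml.etree.ElementTree, which offers neither function in its path syntax, so both are emulated through the element API: the direct text nodes of an element are its own text plus the tail of each child, and itertext() walks all descendant text.

```python
import xml.etree.ElementTree as ET

# The author element from the XML sample in this section.
XML = "<book><author>Tom <em>John</em> cat</author></book>"

root = ET.fromstring(XML)
author = root.find("author")

# Emulation of book/author/text(): only text that is a direct child of
# author, i.e. the text before the first child plus each child's tail.
pieces = [author.text] + [child.tail for child in author]
direct_text = [p for p in pieces if p]

# Emulation of string(book/author): all descendant text, concatenated.
full_text = "".join(author.itertext())

print(direct_text)
print(full_text)
```

The first result holds two separate text nodes ("Tom " and " cat"), while the second is the single string "Tom John cat".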

data()

Most of the time data() and string() are interchangeable, and frequent use of data() is not recommended: reportedly it can hurt XPath performance. data() is an XPath 2.0 function.

With the XPath book/pricing/*/data(), the extracted content is the two separate values 20 and 0.8. They are not strings but atomic values (xs:anyAtomicType; untyped when no schema is applied), so they can be used in arithmetic and passed to mathematical functions. Applied to pricing itself, data() would return the concatenated text of both children as one value.

Use data() when the extracted values are all numbers and you want to calculate with them: string() returns an xs:string, and XPath 2.0 does not perform arithmetic on strings.
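When the XPath engine at hand has no data() function, the numeric conversion can be done in the host language instead. The sketch below assumes Python's standard-library xml.etree.ElementTree and uses the pricing element from the XML sample in this section; it converts each child's text with float() before doing arithmetic, which is a substitute for data(), not an implementation of it.

```python
import xml.etree.ElementTree as ET

# The pricing element from the XML sample in this section.
XML = "<book><pricing><price>20</price><discount>0.8</discount></pricing></book>"

root = ET.fromstring(XML)
pricing = root.find("pricing")

# The string value of pricing itself just glues the child texts together,
# which is not useful as a number.
joined = "".join(pricing.itertext())

# Converting each child separately gives values that support arithmetic.
price = float(pricing.find("price").text)
discount = float(pricing.find("discount").text)
final_price = price * discount

print(repr(joined), price, discount, final_price)
```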

Author: little salted fish YYY

Source: blogs.com/pythonywy/p/11082153.html.

About the author: No matter how long the road is, it will come out step by step. No matter how short the road is, it is impossible to walk without taking a step.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license. When reprinting, please credit the author and the source.
