"However, a first-order theory is just a collection of symbol strings. It seems to me that axiomatic set theory is really studying these symbol strings, not what we usually call a 'set'."
Yes, a first-order theory is indeed a collection of symbol strings, and axiomatic set theory does study these strings. This is not surprising, because that is how modern mathematics works. In a sense, all of mathematics just studies symbol strings. For the same reason, a modern geometry book may not contain a single picture: what it studies is symbol strings, not the usual "points, lines and planes". Studying "sets" purely by intuition can lead to disastrous results, such as Cantor's paradox and Russell's paradox, which is exactly why we need axiomatic set theory.
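To make the danger concrete, here is a short sketch of Russell's paradox and of how the Separation schema of ZF avoids it. The names R, R_A and A are just labels chosen for this illustration.

```latex
% Naive (unrestricted) comprehension lets us form the "set"
R = \{\, x \mid x \notin x \,\}
% Asking whether R is a member of itself gives
R \in R \iff R \notin R
% which is a contradiction.

% The Separation schema only carves subsets out of an existing set A:
R_A = \{\, x \in A \mid x \notin x \,\}
% so membership in R_A now carries the extra condition R_A \in A:
R_A \in R_A \iff (R_A \in A \land R_A \notin R_A)
% If R_A \in A held, this would be the same contradiction as before, hence
R_A \notin A
```

So the same argument no longer produces a contradiction. It only shows that no set A contains every set, which is the fact behind "all sets cannot form a set".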
Regarding the third question, there is indeed some trouble: there appears to be a "circular definition" between first-order logic and set theory. However, the circularity can be avoided by introducing the concept of a class, which corresponds to what we call an "intuitive" set. For example, "all sets" cannot form a set, but they do form a class. We then use "class" instead of "set" when setting up first-order logic. Note that a "class" here is a natural-language notion. But if an ordinary first-order logic textbook doesn't particularly emphasize applications to set theory, why bother? It can simply use sets directly. ...