Qt
Internal/Contributor docs for the Qt SDK. Note: These are NOT official API docs; those are found at https://doc.qt.io/
qstringtokenizer.cpp
1
// Copyright (C) 2020 Klarälvdalens Datakonsult AB, a KDAB Group company, info@kdab.com, author Marc Mutz <marc.mutz@kdab.com>
2
// SPDX-License-Identifier: LicenseRef-Qt-Commercial OR LGPL-3.0-only OR GPL-2.0-only OR GPL-3.0-only
3
// Qt-Security score:significant reason:docs-only
4
#include "qstringtokenizer.h"
#include "qstringalgorithms.h"
7
8
QT_BEGIN_NAMESPACE
9
10
/*!
11
\class QStringTokenizer
12
\inmodule QtCore
13
\since 6.0
14
\brief The QStringTokenizer class splits strings into tokens along given separators.
15
\reentrant
16
\ingroup tools
17
\ingroup string-processing
18
19
QStringTokenizer<Haystack, Needle> is a template class where \a Haystack
20
is the type of the string being tokenized and \a Needle is the type of the
21
separator. In practice, you should never need to specify these template
22
arguments explicitly; they are deduced automatically by the compiler.
23
24
Splits a string into substrings wherever a given separator occurs,
25
returning a (lazily constructed) list of those strings. If the separator does
26
not match anywhere in the string, produces a single-element list
27
containing this string. If the separator is empty,
28
QStringTokenizer produces an empty string, followed by each of the
29
string's characters, followed by another empty string. The two
30
enumerations Qt::SplitBehavior and Qt::CaseSensitivity further
31
control the output.
32
33
QStringTokenizer drives QStringView::tokenize(), but you can use it
34
directly, too:
35
36
\code
37
for (auto token : QStringTokenizer{string, separator})
use(token);
39
\endcode
40
41
\note You should never name the template arguments of a
42
QStringTokenizer explicitly. You may write
43
\c{QStringTokenizer{string, separator}} (without template arguments),
44
or use either QStringView::tokenize() or QLatin1StringView::tokenize(),
45
then store the return value only in an \c{auto} variable:
46
47
\code
48
auto result = strview.tokenize(sep);
49
\endcode
50
51
This is because the template arguments of QStringTokenizer have a
52
very subtle dependency on the specific string and separator types
53
from which they are constructed, and they don't usually
54
correspond to the actual types passed.
55
56
\section1 Lazy Sequences
57
58
QStringTokenizer acts as a so-called lazy sequence, that is, each
59
next element is only computed once you ask for it. Lazy sequences
60
have the advantage that they only require O(1) memory. They have
61
the disadvantage that, at least for QStringTokenizer, they only
62
allow forward, not random-access, iteration.
63
64
The intended use case is to plug it into a range-based for loop:
65
66
\code
67
for (auto token : QStringTokenizer{string, separator})
use(token);
69
\endcode
70
71
or a C++20 ranges algorithm:
72
73
\code
74
std::ranges::for_each(QStringTokenizer{string, separator},
75
[] (auto token) { use(token); });
76
\endcode
77
78
\section1 End Sentinel
79
80
The QStringTokenizer iterators cannot be used with classical STL
81
algorithms, because those require iterator/iterator pairs, while
82
QStringTokenizer uses sentinels. That is, it uses a different
83
type, QStringTokenizer::sentinel, to mark the end of the
84
range. This improves performance, because the sentinel is an empty
85
type. Sentinels are supported from C++17 (for range-based for loops)
86
and C++20 (for algorithms using the new ranges library).
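
The iterator/sentinel split can be illustrated without Qt. The following
is a minimal sketch in standard C++17, assuming a single-character
separator and \c std::string_view input. \c LazySplitter and
\c collectTokens() are hypothetical names chosen for this illustration;
the sketch does not reflect how QStringTokenizer is implemented:

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical, simplified model of a lazy tokenizer; not Qt code.
class LazySplitter
{
public:
    struct sentinel {}; // empty type that marks the end of the range

    class iterator
    {
    public:
        iterator(std::string_view text, char sep) : rest_(text), sep_(sep) { advance(); }

        std::string_view operator*() const { return current_; }
        iterator &operator++() { advance(); return *this; }

        // Comparing against the sentinel only asks "are we done?".
        bool operator!=(sentinel) const { return !done_; }

    private:
        // Computes the next token on demand; nothing is stored ahead of time.
        void advance()
        {
            if (exhausted_) {
                done_ = true;
                return;
            }
            const std::size_t pos = rest_.find(sep_);
            if (pos == std::string_view::npos) {
                current_ = rest_; // last token: everything that is left
                exhausted_ = true;
            } else {
                current_ = rest_.substr(0, pos);
                rest_.remove_prefix(pos + 1);
            }
        }

        std::string_view rest_;
        std::string_view current_;
        char sep_;
        bool exhausted_ = false; // no separator is left after current_
        bool done_ = false;      // iteration has moved past the last token
    };

    LazySplitter(std::string_view text, char sep) : text_(text), sep_(sep) {}

    iterator begin() const { return iterator{text_, sep_}; }
    sentinel end() const { return {}; }

private:
    std::string_view text_;
    char sep_;
};

// Usage: a C++17 range-based for loop accepts differing begin()/end() types.
inline std::vector<std::string_view> collectTokens(std::string_view text, char sep)
{
    std::vector<std::string_view> out;
    for (std::string_view token : LazySplitter{text, sep})
        out.push_back(token);
    return out;
}
```

Because \c begin() and \c end() return different types in this sketch, the
pair cannot be handed to a classical algorithm such as \c std::for_each,
which expects two iterators of the same type.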
87
88
\section1 Temporaries
89
90
QStringTokenizer is very carefully designed to avoid dangling
91
references. If you construct a tokenizer from a temporary string
92
(an rvalue), that argument is stored internally, so the referenced
93
data isn't deleted before it is tokenized:
94
95
\code
96
auto tok = QStringTokenizer{widget.text(), u','};
97
// return value of `widget.text()` is destroyed, but content was moved into `tok`
98
for (auto e : tok)
99
use(e);
100
\endcode
101
102
If you pass named objects (lvalues), then QStringTokenizer does
103
not store a copy. You are responsible for keeping the named object's
104
data around for longer than the tokenizer operates on it:
105
106
\code
107
auto text = widget.text();
108
auto tok = QStringTokenizer{text, u','};
109
text.clear(); // destroy content of `text`
110
for (auto e : tok) // ERROR: `tok` references deleted data!
111
use(e);
112
\endcode
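
The storage rule for temporaries versus named objects can be modeled in
standard C++. This is only a sketch under stated assumptions:
\c PinnedText is a hypothetical class that uses \c std::string in place
of QString and decides at run time with \c std::optional. It is not
meant to describe the mechanism QStringTokenizer actually uses:

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical model of the storage rule; not Qt code.
class PinnedText
{
public:
    // Named object (lvalue): only reference the caller's data.
    explicit PinnedText(const std::string &s) : view_(s) {}

    // Temporary (rvalue): take ownership, then reference the owned copy.
    explicit PinnedText(std::string &&s) : owned_(std::move(s)), view_(*owned_) {}

    // Copying would leave view_ pointing into another object's storage.
    PinnedText(const PinnedText &) = delete;
    PinnedText &operator=(const PinnedText &) = delete;

    bool ownsData() const { return owned_.has_value(); }
    std::string_view view() const { return view_; }

private:
    std::optional<std::string> owned_; // engaged only in the rvalue case
    std::string_view view_;            // declared last: initialized after owned_
};
```

In the lvalue case the view aliases the caller's string, so clearing or
destroying that string invalidates the view, which is the hazard shown in
the second example above.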
113
114
\sa QStringView::split(), QString::split(), QRegularExpression
115
*/
116
117
/*!
118
\typealias QStringTokenizer::value_type
119
120
Alias for \c{const QStringView} or \c{const QLatin1StringView},
121
depending on the tokenizer's \c Haystack template argument.
122
*/
123
124
/*!
125
\typealias QStringTokenizer::difference_type
126
127
Alias for qsizetype.
128
*/
129
130
/*!
131
\typealias QStringTokenizer::size_type
132
133
Alias for qsizetype.
134
*/
135
136
/*!
137
\typealias QStringTokenizer::reference
138
139
Alias for \c{value_type &}.
140
141
QStringTokenizer does not support mutable references, so this is
142
the same as const_reference.
143
*/
144
145
/*!
146
\typealias QStringTokenizer::const_reference
147
148
Alias for \c{value_type &}.
149
*/
150
151
/*!
152
\typealias QStringTokenizer::pointer
153
154
Alias for \c{value_type *}.
155
156
QStringTokenizer does not support mutable iterators, so this is
157
the same as const_pointer.
158
*/
159
160
/*!
161
\typealias QStringTokenizer::const_pointer
162
163
Alias for \c{value_type *}.
164
*/
165
166
/*!
167
\typealias QStringTokenizer::iterator
168
169
This typedef provides an STL-style const iterator for
170
QStringTokenizer.
171
172
QStringTokenizer does not support mutable iterators, so this is
173
the same as const_iterator.
174
175
\sa const_iterator
176
*/
177
178
/*!
179
\typedef QStringTokenizer::const_iterator
180
181
This typedef provides an STL-style const iterator for
182
QStringTokenizer.
183
184
\sa iterator
185
*/
186
187
/*!
188
\typealias QStringTokenizer::sentinel
189
190
This typedef provides an STL-style sentinel for
191
QStringTokenizer::iterator and QStringTokenizer::const_iterator.
192
193
\sa const_iterator
194
*/
195
196
/*!
197
\fn template <typename Haystack, typename Needle> QStringTokenizer<Haystack, Needle>::QStringTokenizer(Haystack haystack, Needle needle, Qt::CaseSensitivity cs, Qt::SplitBehavior sb)
198
\fn template <typename Haystack, typename Needle> QStringTokenizer<Haystack, Needle>::QStringTokenizer(Haystack haystack, Needle needle, Qt::SplitBehavior sb, Qt::CaseSensitivity cs)
199
200
Constructs a string tokenizer that splits the string \a haystack
201
into substrings wherever \a needle occurs, and allows iteration
202
over those strings as they are found. If \a needle does not match
203
anywhere in \a haystack, a single element containing \a haystack
204
is produced.
205
206
\a cs specifies whether \a needle should be matched case
207
sensitively or case insensitively.
208
209
If \a sb is Qt::SkipEmptyParts, empty entries don't
210
appear in the result. By default, empty entries are included.
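
The effect of keeping or skipping empty entries can be modeled in
standard C++. This is a sketch, not Qt code: \c SplitMode and
\c splitModel() are hypothetical stand-ins, the separator is assumed to
be a single character, matching is always case sensitive, and the result
is built eagerly for brevity rather than lazily:

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for Qt::SplitBehavior.
enum class SplitMode { KeepEmptyParts, SkipEmptyParts };

// Eager model of the documented splitting semantics.
inline std::vector<std::string_view> splitModel(std::string_view haystack, char needle,
                                                SplitMode mode = SplitMode::KeepEmptyParts)
{
    std::vector<std::string_view> result;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = haystack.find(needle, start);
        const std::size_t end = (pos == std::string_view::npos) ? haystack.size() : pos;
        const std::string_view token = haystack.substr(start, end - start);
        // Empty tokens are dropped only when the caller asks for it.
        if (!token.empty() || mode == SplitMode::KeepEmptyParts)
            result.push_back(token);
        if (pos == std::string_view::npos)
            break; // no further separator: token was the last one
        start = pos + 1;
    }
    return result;
}
```

With this model, \c{splitModel(",a,,b,", ',')} yields five entries (three
of them empty), while passing \c{SplitMode::SkipEmptyParts} yields only
\c "a" and \c "b".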
211
212
\sa QStringView::split(), QString::split(), Qt::CaseSensitivity, Qt::SplitBehavior
213
*/
214
215
/*!
216
\fn template <typename Haystack, typename Needle> QStringTokenizer<Haystack, Needle>::iterator QStringTokenizer<Haystack, Needle>::begin() const
217
\fn template <typename Haystack, typename Needle> QStringTokenizer<Haystack, Needle>::iterator QStringTokenizer<Haystack, Needle>::cbegin() const
218
219
Returns a const \l{STL-style iterators}{STL-style iterator}
220
pointing to the first token in the list.
221
222
\sa end(), cend()
223
*/
224
225
/*!
226
\fn template <typename Haystack, typename Needle> QStringTokenizer<Haystack, Needle>::sentinel QStringTokenizer<Haystack, Needle>::end() const
227
228
Returns a const \l{STL-style iterators}{STL-style sentinel}
229
pointing to the imaginary token after the last token in the list.
230
231
\sa begin(), cend()
232
*/
233
234
/*!
235
\fn template <typename Haystack, typename Needle> QStringTokenizer<Haystack, Needle>::sentinel QStringTokenizer<Haystack, Needle>::cend() const
236
237
Same as end().
238
239
\sa cbegin(), end()
240
*/
241
242
/*!
243
\fn template <typename Haystack, typename Needle> template<typename LContainer> LContainer QStringTokenizer<Haystack, Needle>::toContainer(LContainer &&c) const &
244
245
Converts the lazy sequence into a (typically) random-access container of
246
type \c LContainer.
247
248
This function is only available if \c LContainer has a \c value_type
249
matching this tokenizer's value_type.
250
251
If you pass in a named container (an lvalue) for \a c, then that container
252
is filled, and a reference to it is returned. If you pass in a temporary
253
container (an rvalue, incl. the default argument), then that container is
254
filled, and returned by value.
255
256
\code
257
// assuming tok's value_type is QStringView, then...
258
auto tok = QStringTokenizer{~~~};
259
// ... rac1 is a QList:
260
auto rac1 = tok.toContainer();
261
// ... rac2 is std::pmr::vector<QStringView>:
262
auto rac2 = tok.toContainer<std::pmr::vector<QStringView>>();
263
auto rac3 = QVarLengthArray<QStringView, 12>{};
264
// appends the token sequence produced by tok to rac3
265
// and returns a reference to rac3 (which we ignore here):
266
tok.toContainer(rac3);
267
\endcode
268
269
This gives you maximum flexibility in how you want the sequence to
270
be stored.
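
The difference between returning a reference and returning by value
follows from how a forwarding reference is deduced. The following sketch
shows the idea in standard C++; \c collectInto() is a hypothetical free
function with a single-character separator, not part of Qt:

```cpp
#include <cstddef>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical free-function analogue of toContainer(); not Qt code.
template <typename Container = std::vector<std::string_view>>
Container collectInto(std::string_view text, char sep, Container &&c = {})
{
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = text.find(sep, start);
        if (pos == std::string_view::npos) {
            c.push_back(text.substr(start)); // last token
            break;
        }
        c.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    // For an lvalue argument, Container is deduced as T&, so this returns
    // a reference to c. For an rvalue (or the default argument), Container
    // is T, so c is moved into the returned value.
    return std::forward<Container>(c);
}
```

Calling \c{collectInto("a,b", ',', named)} with a named vector appends to
that vector and returns a reference to it, while
\c{collectInto("a,b", ',')} falls back to the default container type and
returns a new container by value.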
271
*/
272
273
/*!
274
\fn template <typename Haystack, typename Needle> template<typename RContainer> RContainer QStringTokenizer<Haystack, Needle>::toContainer(RContainer &&c) const &&
275
\overload
276
277
Converts the lazy sequence into a (typically) random-access container of
278
type \c RContainer.
279
280
In addition to the constraints on the lvalue-this overload, this
281
rvalue-this overload is only available when this QStringTokenizer
282
does not store the haystack internally. Otherwise, the returned
container would be full of dangling references:
284
285
\code
286
auto tokens = QStringTokenizer{widget.text(), u','}.toContainer();
287
// ERROR: cannot call toContainer() on rvalue
288
// 'tokens' references the data of the copy of widget.text()
289
// stored inside the QStringTokenizer, which has since been deleted
290
\endcode
291
292
To fix, store the QStringTokenizer in a named variable:
293
294
\code
295
auto tokenizer = QStringTokenizer{widget.text(), u','};
296
auto tokens = tokenizer.toContainer();
297
// OK: the copy of widget.text() stored in 'tokenizer' keeps the data
298
// referenced by 'tokens' alive.
299
\endcode
300
301
You can force this function into existence by passing a view instead:
302
303
\code
304
func(QStringTokenizer{QStringView{widget.text()}, u','}.toContainer());
305
// OK: compiler keeps widget.text() around until after func() has executed
306
\endcode
307
308
If you pass in a named container (an lvalue) for \a c, then that container
309
is filled, and a reference to it is returned. If you pass in a temporary
310
container (an rvalue, incl. the default argument), then that container is
311
filled, and returned by value.
312
*/
313
314
/*!
315
\fn template <typename Haystack, typename Needle, typename...Flags> auto qTokenize(Haystack &&haystack, Needle &&needle, Flags...flags)
316
\relates QStringTokenizer
317
\since 6.0
318
319
Factory function for a QStringTokenizer that splits the string \a haystack
320
into substrings wherever \a needle occurs, and allows iteration
321
over those strings as they are found. If \a needle does not match
322
anywhere in \a haystack, a single element containing \a haystack
323
is produced.
324
325
Pass values from Qt::CaseSensitivity and Qt::SplitBehavior enumerators
326
as \a flags to modify the behavior of the tokenizer.
327
*/
328
329
QT_END_NAMESPACE