The new documentation entry gives an example use case from
libomptarget.
Reviewed By: yln, jhenderson, davezarzycki
Differential Revision: https://reviews.llvm.org/D105208
This patch augments Lit with the ability to parse regular expressions
in boolean expressions. This includes REQUIRES:, XFAIL:, UNSUPPORTED:,
and all other special Lit markup that evaluates to a boolean expression.
Regular expressions can be specified by enclosing them in {{...}},
similarly to how FileCheck handles such regular expressions. The regular
expression can either be on its own, or it can be part of an identifier.
For example, a match expression like {{.+}}-apple-darwin{{.+}} would match
the following variables:
x86_64-apple-darwin20.0
arm64-apple-darwin20.0
arm64-apple-darwin22.0
etc...
In the long term, this could be used to remove the need to handle the
target triple specially when parsing boolean expressions.
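As a rough illustration of the idea (this is not lit's actual parser; the helper name and escaping details are invented for the example), a brace-enclosed regex can be spliced into an otherwise literal identifier like this:

```python
import re

def matches_feature(expression: str, value: str) -> bool:
    # Split on {{...}} spans; odd-indexed parts are raw regexes, the rest is
    # literal text that must be escaped before matching the whole value.
    parts = re.split(r"\{\{(.*?)\}\}", expression)
    pattern = "".join(p if i % 2 else re.escape(p) for i, p in enumerate(parts))
    return re.fullmatch(pattern, value) is not None

assert matches_feature("{{.+}}-apple-darwin{{.+}}", "x86_64-apple-darwin20.0")
assert not matches_feature("{{.+}}-apple-darwin{{.+}}", "x86_64-unknown-linux-gnu")
```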
Differential Revision: https://reviews.llvm.org/D104572
In such cases, the executables are not in the llvm_tools_dir directory, so we need to look in the other search locations. Previously, they were found via the PATH, but this was disabled by default in commit rGa1e6565.
Depends on D103154.
Reviewed By: thopre
Differential Revision: https://reviews.llvm.org/D103156
This updates the googletest format to support tests that use GTEST_SKIP(),
which is now available with the updated googletest framework.
Differential Revision: https://reviews.llvm.org/D102694
This allows tests to detect whether to run or not, dependent on which
LLD version is required for the test.
Reviewed by: thopre
Differential Revision: https://reviews.llvm.org/D101997
Previously, if the search_env argument was specified, and the tool was
found at that location, the path was not reported, unlike other
situations when this function was called. Adding the reporting makes the
function consistent.
Reviewed by: thopre
Differential Revision: https://reviews.llvm.org/D101896
A lit feature guards tests for the lit timeout functionality because on
most systems it depends on the availability of the psutil Python module.
However, that feature is defined based on the ability of the testing lit
to cancel tests, which does not necessarily apply to the ability of the
tested lit.
In particular, RUN commands have a cleared PYTHONPATH and user site
packages are disabled. In the case where psutil is found by the testing
lit via one of those two Python path sources, the tested lit would
not be able to find it, causing timeout tests to fail.
This commit fixes the issue by testing the ability to cancel tests in
the RUN command environment.
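A minimal sketch of that approach, roughly as it might appear in lit's own test-suite config (the feature name here is illustrative, and `config` is the object lit provides to config files, so this is not standalone code):

```python
import os
import subprocess
import sys

# Probe psutil the way a RUN command would see it: cleared PYTHONPATH and user
# site packages disabled (-s), rather than in the testing lit's own process.
env = dict(os.environ, PYTHONPATH="")
probe = subprocess.run(
    [sys.executable, "-s", "-c", "import psutil"],
    env=env,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
if probe.returncode == 0:
    config.available_features.add("lit-max-individual-test-time")
```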
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D99728
This fixes cases where "not not <command>" is supposed to return
only the error codes 0 or 1, but after efee57925c3f46c74c6697,
it passed the original error code through.
This was visible on AIX in the shtest-output-printing.py testcase,
where 'wc' returns 2, while it returns 1 on other platforms, and the
test required "not not" to normalize it to 1.
Keep running "not --crash" via the external "not" executable, but
for plain negations, and for cases that use the shell "!" operator,
just skip that argument and invert the return code.
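A rough sketch of that logic (not lit's actual internal-shell code; the helper and its parameters are invented for illustration):

```python
def run_with_negations(argv, run_command):
    # Strip leading "not" / "!" words and count them, but leave "not --crash"
    # alone so it still runs via the external `not` executable.
    negations = 0
    while argv and argv[0] in ("not", "!"):
        if argv[0] == "not" and len(argv) > 1 and argv[1] == "--crash":
            break
        argv = argv[1:]
        negations += 1
    rc = run_command(argv)
    for _ in range(negations):
        rc = 1 if rc == 0 else 0  # each negation also normalizes to 0 or 1
    return rc

# e.g. "not not wc ..." where wc exits with 2 (as on AIX) yields 1, not 2.
assert run_with_negations(["not", "not", "wc"], lambda argv: 2) == 1
```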
The libcxx tests only use the shell operator "!" for negations,
never the "not" executable, because libcxx tests can be run without
having a fully built llvm tree available providing the "not"
executable.
This allows using the internal shell for libcxx tests.
It should be possible to reland this now that D99938 fixed the
one test failure in clang-tidy that broke when "not" was handled
internally, letting lit/python execute grep.exe directly instead
of via not.exe. (See D99330 and D99406 for more commentary on the
exact issue that broke and other potential ways of fixing it.)
Differential Revision: https://reviews.llvm.org/D98859
Even though we have read the times before,
we intentionally forget about them for performance reasons.
But that means we also forget all the times for the tests
that weren't executed this time. This is mildly inconvenient.
So, when recording the new times, first re-read the old times,
and update times for the tests that were executed,
thus preserving all original times, too.
I.e. when you first run lit on a directory, and then on a single test,
the timing knowledge about everything other than that single test
is lost. This isn't right.
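A sketch of the record step described above (the file name and JSON layout are illustrative; lit's own on-disk format differs):

```python
import json
import os

def record_test_times(times_path, measured):
    # Re-read the previously recorded times, then overwrite only the entries
    # for tests that actually ran this time, preserving everything else.
    times = {}
    if os.path.exists(times_path):
        with open(times_path) as f:
            times = json.load(f)
    times.update(measured)
    with open(times_path, "w") as f:
        json.dump(times, f, indent=2)
```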
All of these depend on the order of tests, so if one runs them twice,
the tests within them will naturally be reordered
using the previous run times, which breaks them.
The "path" recorded for timing purposes is only used as a key into a dictionary. It is never used as an actual path to a filesystem API, therefore we should use '/' as the canonical separator so that Unix and Windows machines can share timing data. This also ensures that the lit testing works across platforms.
Reviewed By: jhenderson, jmorse
Differential Revision: https://reviews.llvm.org/D98767
The test file has embedded slashes. This is fine for normal users that
are just recording and reordering paths, but not great when the trace
data is committed back to a repository that should work on both Unix and
Windows.
Lit as it exists today has three hacks that allow users to run tests earlier:
1) An entire test suite can set the `is_early` boolean.
2) A very recently introduced "early_tests" feature.
3) The `--incremental` flag forces failing tests to run first.
All of these approaches have problems.
1) The `is_early` feature was until very recently undocumented. Nevertheless it still lacks testing and is an imprecise way of optimizing test starting times.
2) The `early_tests` feature requires manual updates and doesn't scale.
3) `--incremental` is undocumented, untested, and it requires modifying the *source* file system by "touching" the file. This "touch" based approach is arguably a hack because it confuses editors (because it looks like the test was modified behind the back of the editor) and "touching" the test source file doesn't work if the test suite is read only from the perspective of `lit` (via advanced filesystem/build tricks).
This patch attempts to simplify and address all of the above problems.
This patch formalizes, documents, tests, and defaults lit to recording the execution time of tests and then reordering all tests during the next execution. By reordering the tests, high core count machines run faster, sometimes significantly so.
This patch also always runs failing tests first, which is a positive user experience win for those that didn't know about the hidden `--incremental` flag.
Finally, if users want, they can _optionally_ commit the test timing data (or a subset thereof) back to the repository to accelerate bots and first-time runs of the test suite.
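The scheduling idea boils down to something like the following sketch (names and the exact tie-breaking are illustrative, not lit's implementation):

```python
def order_tests(test_names, previous_times, previously_failed):
    def key(name):
        # Recently failing tests first, then the slowest tests by recorded
        # time; tests with no recorded time sort last among the passing ones.
        return (name not in previously_failed, -previous_times.get(name, 0.0))
    return sorted(test_names, key=key)

ordered = order_tests(
    ["a.ll", "b.ll", "c.ll"],
    previous_times={"a.ll": 0.2, "b.ll": 30.0},
    previously_failed={"c.ll"},
)
assert ordered == ["c.ll", "b.ll", "a.ll"]
```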
Reviewed By: jhenderson, yln
Differential Revision: https://reviews.llvm.org/D98179
For some build configurations, `check-all` calls lit multiple times to
run multiple lit test suites. Most recently, I've found this to be
true when configuring openmp as part of `LLVM_ENABLE_RUNTIMES`, but
this is not the first time.
If one test suite fails, none of the remaining test suites run, so you
cannot determine if your patch has broken them. It can then be
frustrating to try to determine which `check-` targets will run the
remaining tests without getting stuck on the failing tests.
When such cases arise, it is probably best to adjust the cmake
configuration for `check-all` to run all test suites as part of one
lit invocation. Because that fix will likely not be implemented and
land immediately, this patch introduces `--ignore-fail` to serve as a
workaround for developers trying to see test results until it does
land:
```
$ LIT_OPTS=--ignore-fail ninja check-all
```
One problem with `--ignore-fail` is that it makes it challenging to
detect test failures in a script, perhaps in CI. This problem should
serve as motivation to actually fix the cmake configuration instead of
continuing to use `--ignore-fail` indefinitely.
Reviewed By: jhenderson, thopre
Differential Revision: https://reviews.llvm.org/D96371
In semi-automated environments, XFAILing or filtering out known regressions without actually committing changes or temporarily modifying the test suite can be quite useful.
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D96662
With enough cores, the slowest tests can significantly change the total testing time if they happen to run late. With this change, a test suite can improve performance (for high-end systems) by listing just a few of the slowest tests up front.
Reviewed By: jdenny, jhenderson
Differential Revision: https://reviews.llvm.org/D96594
Some test systems do not use lit for test discovery but only for its
substitution and test selection because they use another way of managing
test collections, e.g. CTest. This forces those tests to be invoked with
lit --no-indirectly-run-check. When a mix of lit versions is in use, this
requires detecting the availability of that option.
This commit provides a new config option standalone_tests to signal a
directory made of tests meant to run as standalone. When this option is
set, lit skips test discovery and the indirectly run check. It also adds
the missing documentation for --no-indirectly-run-check.
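As a sketch, a lit.cfg.py for such a directory might look like this (the suite name, format, and suffixes are illustrative; `config` is the object lit provides when loading the file, and `standalone_tests` is the new option):

```python
import lit.formats

config.name = "externally-managed-suite"
config.test_format = lit.formats.ShTest()
config.suffixes = [".test"]
# Tests here are managed by an external harness (e.g. CTest), so skip lit's
# test discovery and the indirectly-run check.
config.standalone_tests = True
```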
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D94766
Don't produce or expect any output from the infinite looping test -
doing so is a recipe for racy flakiness without a longer timeout to
ensure the output is received first, even though that doesn't seem
integral/important to the test. Instead have a plain, no-output infinite
loop and check that it is caught and handled.
If for some reason the output is valuable for test coverage - the
timeout should be increased from 1 second to give the process time to
output the text, flush, and for that text to be received and buffered
before the test is timed out.
On some of the slow or heavily-loaded bots, this test was failing
intermittently because the infinite_loop.py script might not emit
anything to stdout before the 1 second timeout, so the "Command Output"
line isn't present in the output. That output isn't really important to
this test, we just care that the process is killed, so we can just remove
that check line from the test.
Differential revision: https://reviews.llvm.org/D92563
Currently, a LIT test named directly (on the command line) will
be run even if the name of the test file does not meet the rules
to be considered a test in the LIT test configuration files for
its test suite, for example because the test does not have a
recognised file extension.
This makes it relatively easy to write a LIT test that won't
actually be run. I did exactly that in https://reviews.llvm.org/D82567.
This patch adds an error to avoid users doing that. There is a
small performance overhead for this check. A command line option
has been added so that users can opt into the old behaviour.
Differential Revision: https://reviews.llvm.org/D83069
The tests previously relied on `short.py` and `FirstTest.subTestA`
being executed on a machine within a short time window (1 or 2
seconds). While this "seems to work" it can fail on resource constrained
machines. We could bump the timeout a little bit (bumping it too
much would mean the test would take a long time to execute) but it wouldn't
really solve the problem of the test being prone to failures.
This patch tries to remove this flakiness by separating testing into
two separate parts:
1. Testing if a test can hit a timeout.
2. Testing if a test can run to completion in the presence of a
timeout.
This way we can give (1.) a really short timeout (to make the test run
as fast as possible) and (2.) a really long timeout. This means for (2.)
we are no longer trying to rely on the "short" test executing within
some short time window. Instead the window is now 3600 seconds which
should be long enough even for a heavily resource constrained machine to
execute the "short" test.
Thanks to Julian Lettner for suggesting this approach. This supersedes
my original approach in https://reviews.llvm.org/D88807.
This patch also changes the command line override test to run the quick
test rather than the slow one to make the test run faster.
Differential Revision: https://reviews.llvm.org/D89020
Failing test output sometimes contains control characters like \x1b (e.g.
if there was some -fcolor-diagnostics output) which are not allowed inside
XML files. This causes problems with CI systems: for example, the Jenkins
JUnit XML will throw an exception when encountering those characters and
similar problems also occur with GitLab CI.
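One common way to handle this, shown here only as an illustrative sketch rather than lit's exact escaping code, is to drop every character that XML 1.0 does not allow:

```python
import re

# Characters outside the XML 1.0 'Char' production are simply removed.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def sanitize_for_xml(text: str) -> str:
    return _INVALID_XML_CHARS.sub("", text)

# An ANSI color escape (\x1b) from -fcolor-diagnostics output is stripped.
assert sanitize_for_xml("error\x1b[31m: boom") == "error[31m: boom"
```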
Reviewed By: yln, jdenny
Differential Revision: https://reviews.llvm.org/D84233
Otherwise, if a Lit script contains escaped substitutions (like %%p in this test https://github.com/llvm/llvm-project/blob/master/compiler-rt/test/asan/TestCases/Darwin/asan-symbolize-partial-report-with-module-map.cpp#L10), they are unescaped during recursive application of substitutions, and the results are unexpected.
We solve it using the fact that double percent signs are first replaced with #_MARKER_#, and only after all the other substitutions have been applied, #_MARKER_# is replaced with a single percent sign. The only change is that instead of replacing #_MARKER_# at each recursion step, we replace it once after the last recursion step.
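A simplified sketch of that scheme (not TestRunner's exact code; substitutions are given here as (pattern, replacement) pairs):

```python
import re

def apply_substitutions(line, substitutions, max_depth=10):
    marker = "#_MARKER_#"
    # '%%' is protected up front so it never takes part in any substitution.
    line = line.replace("%%", marker)
    for _ in range(max_depth):
        expanded = line
        for pattern, replacement in substitutions:
            expanded = re.sub(pattern, replacement, expanded)
        if expanded == line:
            break
        line = expanded
    # Only after the last recursion step does the marker turn back into '%',
    # so an escaped '%%p' survives as a literal '%p'.
    return line.replace(marker, "%")

assert apply_substitutions("%t and %%p", [("%t", "/tmp/out")]) == "/tmp/out and %p"
```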
Differential Revision: https://reviews.llvm.org/D83894
When running multiple shards, don't include skipped tests in the xunit
output since merging the files will result in duplicates.
In our CHERI Jenkins CI, I configured the libc++ tests to run using sharding
(since we are testing using a single-CPU QEMU). We then merge the generated
XUnit xml files to produce a final result, but if the individual XMLs
report tests excluded due to sharding each test is included N times in the
final result. This also makes it difficult to find the tests that were
skipped due to missing REQUIRES: etc.
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D84235
As per discussion in D69207, have lit ignore UnicodeDecodeErrors
when running with python 2 in an ASCII shell.
Differential Revision: https://reviews.llvm.org/D82754
Provide `--show-xxx` flags for all non-failure result codes, just as we
already do for `--show-xfail` and `--show-unsupported`.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D82233
Summary: This is a complementary patch to D82100 since the AIX buildbot is still running the unsupported test shtest-format-argv0. Add system-aix to the sub llvm-lit config.
Reviewers: daltenty, hubert.reinterpretcast
Reviewed By: hubert.reinterpretcast
Subscribers: delcypher, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82905
Summary: lit test `shtest-format.py` fails on AIX because one of the subtests of shtest-format requires the tool `[` to be installed under the system PATH. On AIX, `[` is only available as a shell builtin and is not present as an executable file under PATH. Hence, split the original shtest-format into two separate test files and add AIX as UNSUPPORTED for the test using `[`.
Reviewers: daltenty, hubert.reinterpretcast
Reviewed By: hubert.reinterpretcast
Subscribers: delcypher, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82100
Let's have one canonical place to define ResultCode instances and their
labels.
Also make ResultCode's `__init__` function self-registering to better
support custom ResultCodes.
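A minimal sketch of what self-registration looks like (fields and labels are illustrative; lit's real class carries more state):

```python
class ResultCode:
    _all_codes = []

    def __init__(self, name, label, is_failure):
        self.name = name
        self.label = label
        self.is_failure = is_failure
        # Registering in __init__ means custom codes defined by a test suite
        # are picked up automatically, with no hard-coded central list.
        ResultCode._all_codes.append(self)

PASS = ResultCode("PASS", "Passed", False)
FLAKYPASS = ResultCode("FLAKYPASS", "Passed With Retry", False)
FAIL = ResultCode("FAIL", "Failed", True)
```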
Before this change we showed all result groups whose code was not in an
explicitly hard-coded set. This set missed the FLAKYPASS result code.
Let's generalize the code to always show failures and the additionally
requested result codes.
MSVC uses lit for STL testing to run both the libcxx tests and our "native" suite of tests which has feature requirements that are not parsed from the test content. For consistency, the change treats the `unsupported` and `xfails` `Test` properties similarly to `requires`.
Differential Revision: https://reviews.llvm.org/D81782
Pass in all discovered tests to report generators.
The XunitReport generator now creates testcase items for unexecuted
tests and documents why they have been skipped. This makes it easier
to compare test runs with different filters or configurations, or across
platforms.
I don't know who is using the JsonReport generator and what the
expectations there are (it doesn't have tests), so I decided to preserve
the old behavior by filtering out the unexecuted tests.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D81316
In TestRunner.py, D78589 extracts a `_parseKeywords` function from
`parseIntegratedTestScript`, which then expects `_parseKeywords` to
always return a list of keyword/value pairs. However, the extracted
code sometimes returns an unresolved `lit.Test.Result` on a keyword
parsing error, which then produces a stack dump instead of the
expected diagnostic.
This patch fixes that, makes the style of those diagnostics more
consistent, and extends the lit test suite to cover them.
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D81665
Having the input dumped on failure seems like a better
default: I debugged FileCheck tests for a while without knowing
about this option, which really helps to understand failures.
Remove `-dump-input-on-failure` and the environment variable
FILECHECK_DUMP_INPUT_ON_FAILURE which are now obsolete.
Differential Revision: https://reviews.llvm.org/D81422
Improve consistency when printing test results:
Previously we were using different labels for group names (the header
for the list of, e.g., failing tests) and summary count lines. For
example, "Failing Tests"/"Unexpected Failures". This commit changes lit
to label things consistently.
Improve wording of labels:
When talking about individual test results, the first word in
"Unexpected Failures", "Expected Passes", and "Individual Timeouts" is
superfluous. Some labels contain the word "Tests" and some don't.
Let's simplify the names.
Before:
```
Failing Tests (1):
...
Expected Passes : 3
Unexpected Failures: 1
```
After:
```
Failed Tests (1):
...
Passed: 3
Failed: 1
```
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D77708