At 61 or over, I see messages like
File "...\Python\Python39\lib\multiprocessing\connection.py", line 816, in _exhaustive_wait
res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 64
60 seems to work for me.
If this causes issues for anybody else, feel free to revert.
This reverts commit d319005a3746a7661c8c9a3302266b6ff7cf61be.
Causing messages like:
File "...\Python\Python39\lib\multiprocessing\connection.py", line 816, in _exhaustive_wait
res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 74
Revert the 32-process cap on Windows. When testing with Swift, we found
that there was a time reduction for testing with the higher load. This
should hopefully not matter much in practice. In the case that the
original problem with python remains with a high subprocess count, we
can easily revert this change.
Previously, if the search_env argument was specified, and the tool was
found at that location, the path was not reported, unlike other
situations when this function was called. Adding the reporting makes the
function consistent.
Reviewed by: thopre
Differential Revision: https://reviews.llvm.org/D101896
A lit feature guards tests for the lit timeout functionality because on
most system it depends on the availability of the psutil Python module.
However, that feature is defined based on the ability of the testing lit
to cancel test, which does not necessarily apply to the ability of the
tested lit.
In particular, RUN commands have a cleared PYTHONPATH and user site
packages are disabled. In the case where psutil is found by the testing
lit from one of those two source of python path, the tested lit would
not be able to find it, causing timeout tests to fail.
This commit fixes the issue by testing the ability to cancel tests in
the RUN command environment.
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D99728
This fixes cases where "not not <command>" is supposed to return
only the error codes 0 or 1, but after efee57925c3f46c74c6697,
it passed the original error code through.
This was visible on AIX in the shtest-output-printing.py testcase,
where 'wc' returns 2, while it returns 1 on other platforms, and the
test required "not not" to normalize it to 1.
Keep running "not --crash" via the external "not" executable, but
for plain negations, and for cases that use the shell "!" operator,
just skip that argument and invert the return code.
The libcxx tests only use the shell operator "!" for negations,
never the "not" executable, because libcxx tests can be run without
having a fully built llvm tree available providing the "not"
executable.
This allows using the internal shell for libcxx tests.
It should be possible to reland this now that D99938 fixed the
one test failure in clang-tidy that broke when "not" was handled
internally, letting lit/python execute grep.exe directly instead
of via not.exe. (See D99330 and D99406 for more commentery on the
exact issue that broke and other potential ways of fixing it.)
Differential Revision: https://reviews.llvm.org/D98859
This avoids breaking clang-tidy/infrastructure/validate-check-names.cpp
if 'not' is evaluated as a lit internal tool (making TestRunner
invoke 'grep' directly in that test, instead of invoking 'not', which
then invokes 'grep').
The quoting of arguments is still brittle if the executable is an
MSYS based tool though, as MSYS based tools incorrectly unescape
backslashes in quoted arguments (contrary to regular win32 argument
parsing rules), see D99406 and
https://github.com/msys2/msys2-runtime/issues/36 for more examples
of the issues.
Differential Revision: https://reviews.llvm.org/D99938
Quality of progress bar and ETA in lit has always bothered me.
For example, given `./bin/llvm-lit /repositories/llvm-project/clang/test/CodeGen* -sv`
at 1%, it says it will take 10 more minutes,
at 25%, it says it will take 1.25 more minutes,
at 50%, it says it will take 30 more seconds,
and in the end finishes with `Testing Time: 39.49s`. That's rather wildly unprecise.
Currently, it assumes that every single test will take the same amount of time to run on average.
This is is a somewhat reasonable approximation overall, but it is quite clearly imprecise,
especially in the beginning.
But, we can do better now, after D98179! We now know how long the tests took to run last time.
So we can build a better ETA predictor, by accumulating the time spent already,
the time that will be spent on the tests for which we know the previous time,
and for the test for which we don't have previous time, again use the average time
over the tests for which we know current or previous run time.
It would be better to use median, but i'm wary of the cost that may incur.
Now, on **first** run of `./bin/llvm-lit /repositories/llvm-project/clang/test/CodeGen* -sv`
at 10%, it says it will take 30 seconds,
at 25%, it says it will take 50 more seconds,
at 50%, it says it will take 27 more seconds,
and in the end finishes with `Testing Time: 41.64s`. That's pretty reasonable.
And on second run of `./bin/llvm-lit /repositories/llvm-project/clang/test/CodeGen* -sv`
at 1%, it says it will take 1 minutes,
at 25%, it says it will take 30 more seconds,
at 50%, it says it will take 19 more seconds,
and in the end finishes with `Testing Time: 39.49s`. That's amazing i think!
I think people will love this :)
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D99073
Even though we have read the times before,
we intentionally forget about it for performance reasons.
But that means we also forget all the times for the tests
that weren't executed this time. This is mildly inconvenient.
So, when recording the new times, first re-read the old times,
and update times for the tests that were executed,
thus preserving all original times, too.
I.e. when you first run lit on a directory, and then on a single test,
the timing knowledge about anything else other than that single test
is lost. This isn't right.
All of these depend on the order of tests, so if one runs them twice,
the tests within them will naturally be reordered
using the previous run times, which breaks them.
If lit was run on a directory that contained no suites,
then naturally suite[0] will not be there,
and that line would cause python warnings.
So just predicate it with a check that it is there in the first place.
This reverts commit d09adfd3993cbc1043b4d20232bce8bd774232cc.
That commit caused failures in
clang-tidy/infrastructure/validate-check-names.cpp on windows
buildbots.
That change exposed a surprising issue, not directly related to
this change in itself, but in how TestRunner quotes command line
arguments that later are going to be interpreted by a msys based
tool (like grep.exe, when provided by Git for Windows). This
worked accidentally before, when grep was invoked via not.exe
which took a more conservative approach to windows argument quoting.
When running in a Windows Container, the Git for Windows Unix tools
(C:\Program Files\Git\usr\bin) just hang if this variable isn't
passed through.
Currently, running the LLVM/clang tests in a Windows Container fails
if that directory is added to the path, but succeeds after this change.
(After this change, the previously used GnuWin tools can be left out
entirely, too, as lit automatically picks up the Git for Windows tools
if necessary.)
Differential Revision: https://reviews.llvm.org/D98858
Keep running "not --crash" via the external "not" executable, but
for plain negations, and for cases that use the shell "!" operator,
just skip that argument and invert the return code.
The libcxx tests only use the shell operator "!" for negations,
never the "not" executable, because libcxx tests can be run without
having a fully built llvm tree available providing the "not"
executable.
This allows using the internal shell for libcxx tests.
Differential Revision: https://reviews.llvm.org/D98859
The "path" recorded for timing purposes is only used as a key into a dictionary. It is never used as an actual path to a filesystem API, therefore we should use '/' as the canonical separator so that Unix and Windows machines can share timing data. This also ensures that the lit testing works across platforms.
Reviewed By: jhenderson, jmorse
Differential Revision: https://reviews.llvm.org/D98767
The test file has embedded slashes. This is fine for normal users that
are just recording and reordering paths, but not great when the trace
data is committed back to a repository that should work on both Unix and
Windows.
Lit as it exists today has three hacks that allow users to run tests earlier:
1) An entire test suite can set the `is_early` boolean.
2) A very recently introduced "early_tests" feature.
3) The `--incremental` flag forces failing tests to run first.
All of these approaches have problems.
1) The `is_early` feature was until very recently undocumented. Nevertheless it still lacks testing and is a imprecise way of optimizing test starting times.
2) The `early_tests` feature requires manual updates and doesn't scale.
3) `--incremental` is undocumented, untested, and it requires modifying the *source* file system by "touching" the file. This "touch" based approach is arguably a hack because it confuses editors (because it looks like the test was modified behind the back of the editor) and "touching" the test source file doesn't work if the test suite is read only from the perspective of `lit` (via advanced filesystem/build tricks).
This patch attempts to simplify and address all of the above problems.
This patch formalizes, documents, tests, and defaults lit to recording the execution time of tests and then reordering all tests during the next execution. By reordering the tests, high core count machines run faster, sometimes significantly so.
This patch also always runs failing tests first, which is a positive user experience win for those that didn't know about the hidden `--incremental` flag.
Finally, if users want, they can _optionally_ commit the test timing data (or a subset thereof) back to the repository to accelerate bots and first-time runs of the test suite.
Reviewed By: jhenderson, yln
Differential Revision: https://reviews.llvm.org/D98179
Visual Studios implementation of the C++ Standard Library does not use strerror to produce a message for std::error_code unlike other standard libraries such as libstdc++ or libc++ that might be used.
This patch adds a cmake script that through running a C++ program gets the error messages for the POSIX error codes and passes them onto lit through an optional config parameter.
If the config parameter is not set, or getting the messages failed, due to say a cross compiling configuration without an emulator, it will fall back to using pythons strerror functions.
Differential Revision: https://reviews.llvm.org/D98278
This patch uses the errno python library to print out the correct error messages instead of hardcoding the error message per platform.
Reviewed By: jhenderson, ASDenysPetrov
Differential Revision: https://reviews.llvm.org/D97472
For some build configurations, `check-all` calls lit multiple times to
run multiple lit test suites. Most recently, I've found this to be
true when configuring openmp as part of `LLVM_ENABLE_RUNTIMES`, but
this is not the first time.
If one test suite fails, none of the remaining test suites run, so you
cannot determine if your patch has broken them. It can then be
frustrating to try to determine which `check-` targets will run the
remaining tests without getting stuck on the failing tests.
When such cases arise, it is probably best to adjust the cmake
configuration for `check-all` to run all test suites as part of one
lit invocation. Because that fix will likely not be implemented and
land immediately, this patch introduces `--ignore-fail` to serve as a
workaround for developers trying to see test results until it does
land:
```
$ LIT_OPTS=--ignore-fail ninja check-all
```
One problem with `--ignore-fail` is that it makes it challenging to
detect test failures in a script, perhaps in CI. This problem should
serve as motivation to actually fix the cmake configuration instead of
continuing to use `--ignore-fail` indefinitely.
Reviewed By: jhenderson, thopre
Differential Revision: https://reviews.llvm.org/D96371
In semi-automated environments, XFAILing or filtering out known regressions without actually committing changes or temporarily modifying the test suite can be quite useful.
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D96662
With enough cores, the slowest tests can significantly change the total testing time if they happen to run late. With this change, a test suite can improve performance (for high-end systems) by listing just a few of the slowest tests up front.
Reviewed By: jdenny, jhenderson
Differential Revision: https://reviews.llvm.org/D96594
Some test systems do not use lit for test discovery but only for its
substitution and test selection because they use another way of managing
test collections, e.g. CTest. This forces those tests to be invoked with
lit --no-indirectly-run-check. When a mix of lit version is in use, it
requires to detect the availability of that option.
This commit provides a new config option standalone_tests to signal a
directory made of tests meant to run as standalone. When this option is
set, lit skips test discovery and the indirectly run check. It also adds
the missing documentation for --no-indirectly-run-check.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D94766
On z/OS, other error messages are not matched correctly in lit tests.
```
EDC5121I Invalid argument.
EDC5111I Permission denied.
```
This patch adds a lit substitution to fix it.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D95808
On z/OS, the following error message is not matched correctly in lit tests.
```
EDC5129I No such file or directory.
```
This patch uses a lit config substitution to check for platform specific error messages.
Reviewed By: muiez, jhenderson
Differential Revision: https://reviews.llvm.org/D95246
The initial problem with the remaining bot config was resolved.
We can now use Python3. Let's use `os.cpu_count()` to cleanup this
helper.
Differential Revision: https://reviews.llvm.org/D94734
Update shebang to always use Python3 when executing `lit.py` directly.
A previous change of mine [1] revealed that we still use Python2 on some
bot configurations that invoke `llvm/utils/lit/lit.py` as a script
directly (instead of `python3 path/to/lit.py`).
[1] https://reviews.llvm.org/D94734
Differential Revision: https://reviews.llvm.org/D95393
A bot owner contacted me. I will re-land after confirming that this
doesn't break anyone (since it's low priority).
This reverts commit 9946b169c379daee603436a4753acfef8be373dd.
In addition to consistency, we'll hit a wall when 11.1.0 gets released, because
we cannot represent it with lit versioning scheme.
Differential Revision: https://reviews.llvm.org/D94157