Brings a 30% speed boost on x86_64 even though we still process only
one block at a time for now.
Only enabled on x86_64 since the non-vectorized implementation seems
to currently perform better on some architectures (at least on aarch64).
But the non-vectorized implementation still gets a little speed boost
as well (~17%) with these changes.
* Add an optimized squaring routine under the `sqr` name.
Algorithms for squaring bigger numbers efficiently will come in a
PR later.
* Fix a bug where a multiplication was done twice if the threshold for
using the Karatsuba algorithm was crossed. Add a test to make sure
this won't happen again.
* Streamline `pow` method, take a `Const` parameter.
* Minor tweaks to `pow`, avoid bit-reversing the exponent.
* std.fs.Dir.readFile: add doc comments to explain what it means when
the returned slice has the same length as the supplied buffer.
* introduce readSmallFile / writeSmallFile to abstract over the
decision to use symlink or file contents to store data.
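To illustrate the readSmallFile / writeSmallFile idea above, here is a minimal Python sketch, not the actual Zig implementation. It assumes that a short string is stored as a symlink target where possible and as ordinary file contents otherwise. The function names and the fallback policy are assumptions made for illustration.

```python
import os
import tempfile


def write_small_file(path, data):
    """Store a short string, preferring a symlink target over file contents."""
    # Remove any previous entry so both strategies start from a clean slate.
    if os.path.lexists(path):
        os.remove(path)
    try:
        # The data becomes the (possibly dangling) target of the symlink.
        os.symlink(data, path)
    except (OSError, NotImplementedError, ValueError):
        # Fallback for platforms or payloads where a symlink is not usable.
        with open(path, "w", encoding="utf-8") as f:
            f.write(data)


def read_small_file(path):
    """Read back data written by write_small_file, whichever form it took."""
    if os.path.islink(path):
        return os.readlink(path)
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    directory = tempfile.mkdtemp()
    entry = os.path.join(directory, "digest")
    write_small_file(entry, "abc123")
    print(read_small_file(entry))
```

The caller never needs to know which representation was chosen, which is the point of abstracting over the decision.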
Conflicts:
src/clang.zig
Master branch renamed an enum; this branch gave it an explicit tag type
and explicitly initialized values. This commit combines the changes
together.
Conflicts:
cmake/Findllvm.cmake
The llvm11 branch changed 10's to 11's and master branch added the
"using LLVM_CONFIG_EXE" help message, so the resolution was to merge
these changes together.
I also added a check to make sure LLVM is built with AVR enabled, which
is no longer an experimental target.
* std.fs.File.copyRange and copyRangeAll return u64 instead of usize -
the returned value is how much of the `len` is transferred, so the
types should match. This removes the need for an `@intCast`.
* fix typo that removed a subtraction
* Fix the size of codegen.AnyMCValue which gave me a compile error when
I tried to build self-hosted for i386-linux.
* restore the coercion to u64 of syms_sect.sh_info. We want to make
sure the multiplication happens with 64 bits and not the smaller type
used by the ELF format.
* fix another offset parameter in link/Elf.zig to be u64 instead of usize
* add a nice little TODO note to help out Jakub
* FmtError already has FileTooBig in it; we just need to return it.
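The reason for widening syms_sect.sh_info before multiplying can be shown with a small Python sketch. It assumes, for illustration only, that the other factor is the size of an Elf64_Sym entry (24 bytes); the source does not say what sh_info is multiplied by, and the helper names are hypothetical.

```python
U32_MAX = 2**32 - 1
U64_MAX = 2**64 - 1
ELF64_SYM_SIZE = 24  # size in bytes of an Elf64_Sym entry


def fits_u32(value):
    """True if value is representable in the 32-bit type ELF uses for sh_info."""
    return 0 <= value <= U32_MAX


def symtab_bytes(sh_info):
    """Multiply a 32-bit sh_info by the entry size using 64-bit range checks."""
    if not fits_u32(sh_info):
        raise ValueError("sh_info must fit in 32 bits")
    total = sh_info * ELF64_SYM_SIZE
    # A u32 times 24 always fits in 64 bits, but not always in 32 bits.
    assert total <= U64_MAX
    return total


if __name__ == "__main__":
    count = 200_000_000  # hypothetical symbol count
    print(symtab_bytes(count), fits_u32(symtab_bytes(count)))
```

A product that is too large for the 32-bit field type still fits comfortably in a u64, so the operand has to be widened before the multiplication rather than after it.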
On some distros (e.g. Void Linux) the release field of the utsname
struct may contain an underscore followed by a revision number at the
end (e.g. 5.8.12_2).
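A minimal Python sketch of tolerant release parsing follows. It is an assumption about one reasonable approach, not the parser used in the Zig standard library: it reads the leading numeric components and ignores any suffix such as `_2`. The function name is hypothetical.

```python
import re


def parse_kernel_release(release):
    """Return (major, minor, patch) from a uname release string.

    Anything after the leading numeric components, such as a distro
    revision suffix, is ignored. A missing patch component becomes 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized release string: %r" % release)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


if __name__ == "__main__":
    print(parse_kernel_release("5.8.12_2"))
```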
* Correctly scan all the exponent bits. Previously an incorrect result
was computed for exponents that are powers of two.
* Allocate enough limbs to make llmulacc stop whining.
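The exponent-scanning fix above can be illustrated with a left-to-right square-and-multiply loop. This Python sketch is not the std.math.big code. It assumes the exponent is a non-negative integer and walks its bits from the most significant one down, so no bit reversal is needed and every bit, including the trailing zeros of a power of two, is visited.

```python
def pow_left_to_right(base, exponent):
    """Compute base**exponent by scanning exponent bits from MSB to LSB."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1
    # bit_length() is 0 for exponent == 0, so the loop body is skipped.
    for shift in range(exponent.bit_length() - 1, -1, -1):
        result *= result  # square once for every bit scanned
        if (exponent >> shift) & 1:
            result *= base  # multiply only when the bit is set
    return result


if __name__ == "__main__":
    print(pow_left_to_right(3, 8))
```

For a power-of-two exponent only the top bit is set, so stopping the scan early would skip the remaining squarings. Testing such exponents explicitly guards against that class of bug.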
This reverts commit 70f3767903.
After discussion, I can see the value provided here, specifically with
avoiding the footgun of defer { suspend { free(@frame()); } }.
However the doc comments are updated to explain the semantics directly,
rather than basing them on the behavior of another programming language.
Before this it was trying to copy all the files from the zig-cache dir
to the output dir, but in the current compiler architecture the cache
dir is also used for internal compiler files.