Code GCC produces that makes you cry #12684 |
|
Simon Willcocks |
Message #124444, posted by Stoppers at 19:00, 13/2/2019, in reply to message #124443 |
Member
Posts: 302
|
I'd mixed up what the parameters meant, so the last value was too large, which meant that the function would have returned null... |
|
Jeffrey Lee |
Message #124445, posted by Phlamethrower at 19:08, 13/2/2019, in reply to message #124443 |
Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot stuff
Posts: 15100
|
(Those first two instructions are interesting, as well!) As a guess, I'd say you're looking at an unlinked object file, in which case the instruction at 284 could be a placeholder which will later get patched with a proper value by the linker.
The rest would probably only make sense if I knew what the rest of the code looked like.
|
Simon Willcocks |
Message #124446, posted by Stoppers at 23:22, 13/2/2019, in reply to message #124445 |
Member
Posts: 302
|
I wondered about that, too, but:
built_drivers/memory_management.elf: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, with debug_info, not stripped
Moving slowly on... |
|
Jeffrey Lee |
Message #124502, posted by Phlamethrower at 13:14, 3/6/2019, in reply to message #124446 |
Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot stuff
Posts: 15100
|
GCC's NEON auto-vectorisation is abysmal.
This article is a good starting point for all the things you need to tweak to get vectorisation working nicely on x86. But ARM NEON instructions will work happily with word-aligned data, so there won't be any need to specify alignment for your data, right? Wrong.
This produces nice tight code:
#include <stdlib.h>
#include <math.h>

#define SIZE (1L << 16)

void test1(int * __restrict a, int * __restrict b)
{
    int i;

    /* a is promised 4-byte aligned, b 8-byte aligned */
    int *x = (int *) __builtin_assume_aligned(a, 4);
    int *y = (int *) __builtin_assume_aligned(b, 8);

    for (i = 0; i < SIZE; i++) {
        x[i] += y[i];
    }
}
This also produces nice code:
#include <stdlib.h>
#include <math.h>

#define SIZE (1L << 16)

void test1(int * __restrict a, int * __restrict b)
{
    int i;

    /* a is promised 8-byte aligned, b 4-byte aligned */
    int *x = (int *) __builtin_assume_aligned(a, 8);
    int *y = (int *) __builtin_assume_aligned(b, 4);

    for (i = 0; i < SIZE; i++) {
        x[i] += y[i];
    }
}
This produces an 80-instruction monstrosity:
#include <stdlib.h>
#include <math.h>

#define SIZE (1L << 16)

void test1(int * __restrict a, int * __restrict b)
{
    int i;

    /* both pointers are only promised 4-byte aligned */
    int *x = (int *) __builtin_assume_aligned(a, 4);
    int *y = (int *) __builtin_assume_aligned(b, 4);

    for (i = 0; i < SIZE; i++) {
        x[i] += y[i];
    }
}
Yep - one of the pointers needs to be at least doubleword-aligned, otherwise it adds a bunch of extra code to try and deal with imagined alignment issues.
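For anyone wanting to try this, here is a minimal sketch of one way a caller could actually meet that alignment promise. It assumes a C11 library providing aligned_alloc; the names make_buffer and add_arrays are hypothetical and not from the code above, and the assert is only there to catch a broken promise in a debug build.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SIZE (1L << 16)

/* Hypothetical helper: allocate SIZE ints on a 16-byte boundary (C11).
   aligned_alloc wants the size to be a multiple of the alignment,
   which SIZE * sizeof(int) is. Caller frees with free(). */
int *make_buffer(void)
{
    return aligned_alloc(16, SIZE * sizeof(int));
}

/* Same loop as test1, but promising 8-byte alignment for both pointers. */
void add_arrays(int * __restrict a, int * __restrict b)
{
    long i;

    /* The builtin is a promise, not a check, so verify it in debug builds. */
    assert(((uintptr_t)a % 8) == 0 && ((uintptr_t)b % 8) == 0);

    int *x = (int *) __builtin_assume_aligned(a, 8);
    int *y = (int *) __builtin_assume_aligned(b, 8);

    for (i = 0; i < SIZE; i++) {
        x[i] += y[i];
    }
}
```

Whether this gives the short code sequence will depend on the GCC version and flags, so it is worth checking the disassembly rather than taking it on trust.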
[Edited by Phlamethrower at 14:15, 3/6/2019] |
|
Jeffrey Lee |
Message #124503, posted by Phlamethrower at 21:48, 3/6/2019, in reply to message #124502 |
Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot stuff
Posts: 15100
|
RISC OS GCC 4.7.4 seems to be a bit more finicky, requiring at least one of the variables to be 16-byte aligned to avoid nastiness.
I can understand some of its logic - the code will be faster if the pointers are aligned, especially for large buffers, so it makes sense to add extra code to bring one of the pointers into alignment. But why is the extra code so bloody long? You can write the same thing in about half as many instructions.
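To illustrate the idea in C rather than assembler, here is a rough sketch of doing the alignment fix-up by hand: a short scalar loop runs until the destination pointer reaches a 16-byte boundary, then the main loop is given an aligned pointer. The name add_peeled is hypothetical, and this is not a claim about what GCC emits or how many instructions it compiles to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: adds y[i] into x[i] for n elements. */
void add_peeled(int * __restrict x, const int * __restrict y, size_t n)
{
    size_t i = 0;
    size_t j;

    /* x must be int-aligned, so stepping one int at a time
       is guaranteed to land on a 16-byte boundary. */
    assert(((uintptr_t)x % sizeof(int)) == 0);

    /* Prologue: scalar adds until x + i is 16-byte aligned
       (at most 3 iterations when int is 4 bytes). */
    while (i < n && ((uintptr_t)(x + i) % 16) != 0) {
        x[i] += y[i];
        i++;
    }

    if (i == n) {
        return; /* everything was handled by the prologue */
    }

    /* Main loop: the destination is now known to be 16-byte aligned.
       Nothing is promised about the source pointer. */
    int *xa = (int *) __builtin_assume_aligned(x + i, 16);
    const int *ys = y + i;

    for (j = 0; j < n - i; j++) {
        xa[j] += ys[j];
    }
}
```

Only the destination is brought into alignment here, which matches the observation above that one aligned pointer is enough to avoid the long code path.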
[Edited by Phlamethrower at 22:48, 3/6/2019] |
|
Jeffrey Lee |
Message #124574, posted by Phlamethrower at 11:17, 22/9/2019, in reply to message #124503 |
Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot Hot stuff
Posts: 15100
|
New clang optimisations!
http://releases.llvm.org/9.0.0/docs/ReleaseNotes.html#id4
I should really give it a go at some point. |
|
Chris Johns |
Message #124712, posted by cmj at 09:09, 21/1/2020, in reply to message #124574 |
Member
Posts: 9
|
Has anyone looked at what would be involved in getting clang / llvm to target RISC OS? I did vaguely wonder at one point if it could be persuaded to output AIF / AOF (although I didn't really get far with that), but I suspect ELF would be easier and more sensible.
|
Rob Kendrick |
Message #124713, posted by nunfetishist at 09:38, 21/1/2020, in reply to message #124712 |
Today's phish is trout a la creme.
Posts: 524
|
"Has anyone looked at what would be involved in getting clang / llvm to target RISC OS? I did vaguely wonder if it could be persuaded to output AIF / AOF at one point (although didn't really get far with that), but I suspect ELF would be easier and more sensible."
I think John Tytgat looked into this, but I seem to recall that RISC OS's poor memory model makes it tricky because of all the function pre- and post-ambles needed for the stack to work.
|
Carlos Li |
Message #125718, posted by Carlos at 06:42, 21/11/2024, in reply to message #124443 |
Member
Posts: 1
|
What's up with Borland?
[Edited by Carlos at 06:47, 21/11/2024]
[Edited by Carlos at 06:55, 21/11/2024] |
|