Yep
Mar 29 2017
Mar 28 2017
Andre, can we close this bug?
1.9.0 has been released.
Yes, print *a was correct. Could you please do
print *sc->load_stack[sc->file_i]->curr_line
there?
Thanks, sounds like you have plans to address all three of the problems then.
Cheers
Indeed. I raised the limit to 5, do you think that this is ok?
I see. Let's get back to this after the release of 1.9
The keyserver helper programs, which caused some not very useful error
messages, have been removed from 2.1. Thus the error messages are different and
might be better - at least dirmngr, which is responsible for fetching keys, can
create a detailed log file.
I tagged this as wontfix because we won't make any changes to 2.0 anymore; it
reaches EOL at the end of the year.
Please feel free to re-open this bug if you experience such problems with
2.1.19 or higher as well.
Thanks for reporting.
Hi!
re 1. Linking the release notes to the NEWS files is pretty new. We have a
script to do this, but it still needs a manual build, which I have not done
yet. For the last 10 days or so we have again had an autobuilder for the
website, which can take over the manual build step. This still needs to be
done. I am keeping this bug open to track it.
re 2. The labels attached to the branches caused more confusion than they
helped. The plan is to remove 2.0 entirely (its EOL is in 9 months). The website
should prominently show only the stable version. And yes, "modern" has been
stable as well.
re 3. To avoid confusion, 1.4 has mostly been removed from the front page, for
the same reason as above. However, we need to state this somewhere.
I've now pulled from the current master head
(caf00915532e6e8e509738962964edcd14fb0654), rebuilt on zelenka with -O0 -g, and
triggered the error again, causing a core file to be dumped.
I copied gpgscm-gdb.py into tests/gpgscm/ , added it to add-auto-load-safe-path
in ~/.gdbinit, and then ran "gdb -c tests/gpgscm/core tests/gpgscm/gpgscm" and
tried to print a, as requested. here's what i got:
0 (sid_s390x-dchroot)dkg@zelenka:~/src/gnupg2/gnupg2/build$ echo add-auto-load-safe-path /home/dkg/src/gnupg2/gnupg2/build/tests/gpgscm/gpgscm-gdb.py > /home/dkg/.gdbinit
0 (sid_s390x-dchroot)dkg@zelenka:~/src/gnupg2/gnupg2/build$ gdb -c tests/gpgscm/core ./tests/gpgscm/gpgscm
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later < GPL license >
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "s390x-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
< GDB Bugs >.
Find the GDB manual and other documentation resources online at:
< GDB Documentation >.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./tests/gpgscm/gpgscm...done.
[New LWP 7145]
Core was generated by `./gpgscm ../../../tests/gpgscm/t-child.scm'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000002aae4ecf748 in is_vector (p=0x4634508) at ../../../tests/gpgscm/scheme.c:220
220 INTERFACE INLINE int is_vector(pointer p) { return (type(p)==T_VECTOR); }
(gdb) bt
#0 0x000002aae4ecf748 in is_vector (p=0x4634508) at ../../../tests/gpgscm/scheme.c:220
#1 0x000002aae4ed3470 in vector_elem (vec=0x4634508, ielem=7) at ../../../tests/gpgscm/scheme.c:1349
#2 0x000002aae4ed975e in tailstack_flatten (sc=0x2ab046296f0, tailstack=0x4634508, i=8, n=7, acc=0x2ab04629838) at ../../../tests/gpgscm/scheme.c:3117
#3 0x000002aae4ed99d4 in callstack_flatten (sc=0x2ab046296f0, i=8, n=7, acc=0x2ab04629838) at ../../../tests/gpgscm/scheme.c:3155
#4 0x000002aae4ed9af0 in history_flatten (sc=0x2ab046296f0) at ../../../tests/gpgscm/scheme.c:3173
#5 0x000002aae4ed8488 in _Error_1 (sc=0x2ab046296f0, s=0x2aae4efe634 "eval: unbound variable:", a=0x2ab0462bdd8) at ../../../tests/gpgscm/scheme.c:2777
#6 0x000002aae4eda162 in opexe_0 (sc=0x2ab046296f0, op=OP_EVAL) at ../../../tests/gpgscm/scheme.c:3298
#7 0x000002aae4ee3ef0 in Eval_Cycle (sc=0x2ab046296f0, op=OP_T0LVL) at ../../../tests/gpgscm/scheme.c:5358
#8 0x000002aae4ee5384 in scheme_load_named_file (sc=0x2ab046296f0, fin=0x2ab04684f90, filename=0x2ab04684d80 "../../../tests/gpgscm/init.scm") at ../../../tests/gpgscm/scheme.c:5748
#9 0x000002aae4ec1ec6 in load (sc=0x2ab046296f0, file_name=0x2aae4efc7d4 "init.scm", lookup_in_cwd=0, lookup_in_path=1) at ../../../tests/gpgscm/main.c:180
#10 0x000002aae4ec22cc in main (argc=0, argv=0x3ffffe44e48) at ../../../tests/gpgscm/main.c:266
(gdb) up 5
#5 0x000002aae4ed8488 in _Error_1 (sc=0x2ab046296f0, s=0x2aae4efe634 "eval: unbound variable:", a=0x2ab0462bdd8) at ../../../tests/gpgscm/scheme.c:2777
2777 history = history_flatten(sc);
(gdb) print a
$1 = (pointer) 0x2ab0462bdd8
(gdb) print *a
$2 = define-macro
(gdb)
maybe i'm doing something wrong? i'll ask and see whether i can give out an
account on the porterbox for you, justus.
What about gpgme_get_dirinfo ("agent-socket")?
I did not know about that, and that helps a bit, but has the downside that it
uses the GNUPGHOME from the process' environment.
I'm thinking about the following use case. I have created an ephemeral home
directory to contain the results or side-effects of some operation, and now I
want to talk to the agent that serves that particular home directory. I
cannot use gpgme_get_dirinfo because that uses GNUPGHOME, and I don't want to
change the environment variable because that is a process-global thing and I
don't want to interfere with other threads.
I think that NetBSD also defines single-threaded versions of the pthread_*
functions in libc.
How about attached patch in configure.ac?
(You need to generate configure)
It seems that -lrt is required on NetBSD.
Mar 27 2017
What about
gpgme_get_dirinfo ("agent-socket")
? For testing you can use
GNUPGHOME=/foo/bar gpgme/tests/t-engine-info 2>&1 | grep agent-info
I have looked into this. I installed Debian on an s390 emulator (hercules), but
have been unable to reproduce the problem there, maybe due to the emulation (it
is quite slow on my system, and the gpgscm interpreter seems especially slow,
maybe because of the challenge of doing branch prediction on interpreters).
Your stack trace suggests a memory corruption early during the initialization
("init.scm", the standard library, is being loaded), we see an error being
generated due to an unbound variable (i.e. the environment hash table is
corrupted / does not perform as expected). Then we see a segfault while the
history buffer is flattened into a list for the error message (i.e. hints at a
corruption).
Unfortunately, memory corruption bugs are very hard to detect in gpgscm due to
its use of a custom memory allocator. The allocator allocates large segments
using malloc and hands out cells from that pool as necessary. However, memory
is never freed, so tools like valgrind can not be used to detect use-after-free,
or even most out-of-bounds accesses.
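To illustrate why such an allocator hides errors from valgrind, here is a minimal sketch of a segment-based cell pool. It is not gpgscm's actual allocator: the names (struct pool, pool_get, pool_put, CELLS_PER_SEGMENT) and the cell layout are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not gpgscm's code. */
#define CELLS_PER_SEGMENT 1024

struct cell {
  struct cell *next_free;  /* link in the free list while unused */
  long payload;            /* stand-in for the interpreter's data */
};

struct pool {
  struct cell *free_list;  /* cells available for (re)use */
  size_t segments;         /* number of segments obtained from malloc */
};

/* Get one large segment from malloc and thread all of its cells onto
   the free list.  Returns 0 on success, -1 if malloc fails. */
int pool_grow(struct pool *p)
{
  struct cell *seg = malloc(CELLS_PER_SEGMENT * sizeof *seg);
  size_t i;

  if (!seg)
    return -1;
  for (i = 0; i < CELLS_PER_SEGMENT; i++) {
    seg[i].next_free = p->free_list;
    p->free_list = &seg[i];
  }
  p->segments++;
  return 0;
}

/* Hand out one cell, growing the pool if the free list is empty. */
struct cell *pool_get(struct pool *p)
{
  struct cell *c;

  if (!p->free_list && pool_grow(p) != 0)
    return NULL;
  c = p->free_list;
  p->free_list = c->next_free;
  c->next_free = NULL;
  return c;
}

/* Recycle a cell.  This only relinks it; free() is never called, so
   as far as malloc (and valgrind) can tell the memory is still live. */
void pool_put(struct pool *p, struct cell *c)
{
  assert(c != NULL);
  c->next_free = p->free_list;
  p->free_list = c;
}
```

In this sketch, a pointer that is still used after pool_put refers to memory inside a live malloc block, and a write past one cell lands in the neighbouring cell of the same block, so neither is reported as an invalid access.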
I have been working on the low-level allocator last week trying to make it more
debuggable and memory errors more detectable, e.g. by moving parts of the
interpreter into readonly sections.
As Werner said, a stack trace with fewer optimizations would be helpful. Also,
is the problem always the same if it happens? If so, it would be interesting to
know what kind of variable is unbound (for that, inspect the 'a' parameter of
'_Error_1' [I'm attaching a pretty-printer for gdb, with that, do 'print a']).
Access to the porter box would be helpful as well.
As of 348da58fe0c3656e6177c98fef6b4c4331326c8e all Python tests are skipped with
GnuPG < 2.1.12.
Thanks very much! I have solved the problem.
Mar 26 2017
Please do not post files in closed formats like Microsoft word. We will only
look at reports in a plain text format.
From your description it looks more like a build problem because Libgcrypt is
already part of Ubuntu and installing a different version is possible but you
need to get some things right. In general I would suggest to write to
gcrypt-devel@gnupg.org
Mar 25 2017
Can you rebuild using -O0 -g and try to get a back trace again. That might be
helpful.
Mar 24 2017
We also have a discussion on the mailing list. It currently does not make sense
to continue here.
The problem of NFS mounted home directories is _real_ and we have a solution for
this which is better than the old redirection hack.
The problem with too long socket names is not severe and has been around for
decades (for other software and 14 years for GnuPG). There are workarounds, and
/run/user also solves this.
I proposed a change which does not even require --create-socketdir. There was
no comment on this and thus I will push that now so that we can do a real life test.
I concur. We should disable the Python tests for gnupg versions < 2.1.12 (which
is about a year old)
I've rebased the patches against 1.8.0 but I still saw 22 failing python tests
with 2.0.26
Master fails for me even harder with 36 tests failing.
The gpg-connect-agent calls fail because --agent-program is not supported. In
master we even have --debug-quick-random, which is even more recent (but which we
would also need in random-starved environments like build daemons).
My preferred solution at this point would be to just say for 2.0.x the python
tests are unsupported and disabled completely. All the problems are with our
agent setup regarding the test suite and not really with functionality.
Justus: I told you several times that we are not going to change working code
for no good reason.
Except that it is not working. If it was working, then
06f1f163e96f1039304fd3cf565cf9de1ca45849
would not be necessary.
Even if your hack (I call it a hack because it does not
work with getsockname)
1/ Yes it does. It returns precisely the path that was used in bind.
2/ We only use getsockname on sockets that were given us by a service manager
like systemd, and thus those sockets would be unaffected by "the hack".
would make it, it does not solve the major problem: The
inability of creating sockets on certain file systems. THAT is the major reason
why we moved to /var/run.
Please stop conflating these things. This bug is about "dirmngr and gpg-agent
should work automatically even when GNUPGHOME is larger than sun_path". It is
not about NFS or FAT or something.
Mar 23 2017
Mar 22 2017
Roundup won't let me include the details, but i will say that from a git bisect,
i discovered that the first commit that has this behavior is
49e2ae65e892f93be7f87cfaae3392b50a99e4b1 ("gpgscm: Use a compact vector
representation.")
The crashes that happen are segfaults.
Hello Werner,
The problem is that some projects like gpgtools for MacOS are reluctantly sticking to
gnupg-2.0 :-/
So, I'd love to have this patch committed in order to ease the transition phase from
2.0 to 2.1 for them.
Regards, Wolfgang
Oh yes, then we should include NetBSD at least into the nPth and libgpg-error
builds.
NetBSD has its own pthread library (different from OpenBSD and FreeBSD), so I
think this would be a good idea.
Our jenkins has no problems building nPth for OpenBSD 6.0.
wiz: Do you think that NetBSD (x86 I assume) is much different than OpenBSD so
that we would benefit from adding NetBSD to our Jenkins builds?
gniibe: Do you have time to look into this?
The log indeed does not match the former gpgme2.trace.
I would try to switch to gpgme 1.8.0 to see whether you can still reproduce the
problem.
Given that 2.0 will reach EOL in 9 months, I don't think it is worth backporting
and testing that patch.
Right now it seems to be complaining about an expired key.
(Could be that this is a different error than originally reported because it's
been some time...)
Mar 21 2017
Sorry, I thought I would receive an email when this was updated.
We don't do this on-the-fly to avoid cluttering the /run/user with directories.
So, I was expecting gnupg to use /run/user/$UID/gnupg/. However, if GNUPGHOME is
set, it uses /run/user/$UID/gnupg/d.$GNUPGHOME_HASH/. Therefore, by "littering",
I assume you mean littering /run/user/$UID/gnupg/ (otherwise this argument makes
no sense).
I'm leaving this here for future readers as I can't find *any* documentation of
this behavior (the use of d.$GNUPGHOME_HASH).
---
Regardless, my actual goal is to move the homedir to ~/.local/share/gnupg (not
have multiple homedirs) as described in (T1456)
so I really do want sockets to go in /run/user/$UID/gnupg/. However, I'm
guessing that's not going to be possible.
Thanks. So gpgsm is started but terminates before it can actually receive a
command from GPGME. Can you please add
log-file /bar/bar/gpgsm.log
debug 1024
verbose
to ~/.gnupg/gpgsm.conf and run the test again. Please post the log again.
Justus: I told you several times that we are not going to change working code
for no good reason. Even if your hack (I call it a hack because it does not
work with getsockname) would make it, it does not solve the major problem: The
inability of creating sockets on certain file systems. THAT is the major reason
why we moved to /var/run.
The whole IPC thing is pretty complex and adding a non-standard hack as proposed
by Justus will for sure cause breakage on some platforms.
I'm not sure why you call it a hack. I've been looking at POSIX: [0] introduces
pathname resolution, and the terms 'relative path' and 'absolute path'.
0: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_13
Neither the page for connect [1], nor the one for bind [2] state that the path
used to connect/bind unix sockets must be an absolute path.
1: http://pubs.opengroup.org/onlinepubs/9699919799/functions/connect.html#
2: http://pubs.opengroup.org/onlinepubs/9699919799/functions/bind.html#
Furthermore, my test across a wide range of UNIX implementations did not show
any issues with using relative paths.
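For readers unfamiliar with the technique under discussion, the following is a rough sketch of binding a Unix socket through a relative path so that only the short socket name has to fit into sun_path. It is not the code from GnuPG or from any patch mentioned here; bind_in_dir is a hypothetical helper written for this illustration. Note that chdir affects the whole process, so the sketch as written is not safe in a multi-threaded program.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper, not GnuPG code: bind a Unix socket called NAME
   inside DIR by temporarily changing the working directory, so that
   only NAME (not DIR/NAME) must fit into sun_path.
   Returns the socket descriptor, or -1 with errno set. */
int bind_in_dir(const char *dir, const char *name)
{
  struct sockaddr_un addr;
  int fd, cwd, rc, saved_errno;

  assert(dir != NULL && name != NULL);
  if (strlen(name) >= sizeof addr.sun_path) {
    errno = ENAMETOOLONG;
    return -1;
  }

  /* Remember the current directory so we can return to it. */
  cwd = open(".", O_RDONLY);
  if (cwd < 0)
    return -1;

  fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0 || chdir(dir) != 0) {
    saved_errno = errno;
    if (fd >= 0)
      close(fd);
    close(cwd);
    errno = saved_errno;
    return -1;
  }

  memset(&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;
  strcpy(addr.sun_path, name);  /* relative path: just the file name */
  rc = bind(fd, (struct sockaddr *)&addr, sizeof addr);
  saved_errno = errno;

  /* Always try to restore the previous working directory. */
  if (fchdir(cwd) != 0 && rc == 0) {
    rc = -1;
    saved_errno = errno;
  }
  close(cwd);

  if (rc != 0) {
    close(fd);
    errno = saved_errno;
    return -1;
  }
  return fd;
}
```

Calling getsockname on the returned descriptor reports the relative name that was passed to bind, which is the behaviour the two sides disagree about above.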
Ok, closing this bug. Feel free to reopen it if you reconsider.
Fixed in 88f1505f0613894d5544290a170119eb538921e5.
I've run mutt as suggested on the troublesome email; the resulting log is attached.
To debug this you need to run mutt like this:
GPGME_DEBUG=9:gpgme.trace: mutt
The trace file will be pretty verbose but contains everything GPGME sees from
the engine.
Mar 20 2017
Unfortunately I'm unable to test this properly, because the patches can't be
applied properly to 1.8.0 (I need to add them to the package).
Are they sending ASCII armored files (those with "-----BEGIN PGP MESSAGE-----")
or binary data?
They might have used the -t (--textmode) option and removed that.
But more likely is that this is one of the usual CR,LF problems. For example
when using FTP, and sending binary data, it is important to switch to binary
mode first. Without looking at the data it is hard to help.
FYI this is:
Skip tests if GnuPG is too old.
Use 'gpg-agent --allow-loopback-pinentry' if applicable.
DUPLICATE