Bugzilla – Bug 3403
ODBC 3.0 return code (SQL_NO_DATA) not supported for searched update/delete SQL calls
Last modified: 2005-10-26 20:02:17
You need to log in before you can comment on or make changes to this bug.
| I installed GT4 with RLS using PostgreSQL ODBC driver. | When I try to add an entry to the server using | "globus-rls-cli create 1234 gsiftp://client.grid/tmp/file | rls://client.grid/", | I get "globus_rls_client: DB error" error message | and "odbc_error: SQLGetDiagRec failed: 100" in the logs. | | ODBC trace shows, that failed SQL query is | "update t_pfn set ref = ref + 1 where name = | 'gsiftp://client.grid/tmp/file'". | This operation returns SQL_NO_DATA_FOUND, | but updateref function (replica/rls/server/db.c:2145) | do not accept it. | | Is it RLS or PostgreSQL issue? Did I miss something? | I didn't find similar problems with RLS by other users, | but SQL_NO_DATA_FOUND seems to be correct value here. According to, http://msdn.microsoft.com/library/default.asp?url=/library/en- us/odbc/htm/odbcreturning_sql_no_data.asp, it depends on the combination of versions between the odbc app and odbc driver. This is a bit confusing to me to. I'm not sure which return code RLS should expect but it looks like a simple change to RLS to differentiate between RCs pertaining to queries versus searched updates/deletes could help the situation.
Created an attachment (id=641) [details] Patch to db.c to accept SQL_NO_DATA and SQL_NO_DATA_FOUND on updates The file can be dropped in replica/rls/server/ and then rebuild/install the server.
I'd really appreciate it if someone can try this patch out and confirm that the patch corrects the bug. It will take me longer if I have to setup a pgsql environment and recreate the bug first. thanks
Is it checked into the branch? The rebuild/install is a very vague description. I only have instruction how to build a branch (or, optionally, trunk) from CVS, and it takes 2.5 hours. [... some hours later ...] OK, I've manually moved the db.c file after running configure and before running make. It appears to make things move smoother: rls> create test me globus_rls_client: PFN doesn't exist: me The server does not get stuck any more :-), but it does not do the "right thing (tm)" :-( 2005-06-16 17:02:55 T213006: rli_bfupdates: 2005-06-16 17:02:55 T213006: lock_get: 806AA00 readlock 2005-06-16 17:02:55 T213006: lock_get(806AA00): 1 readers 2005-06-16 17:02:55 T213006: lock_release(806AA00): 0 readers 2005-06-16 17:02:55 T213006: lock_get: 806AA60 readlock 2005-06-16 17:02:55 T213006: lock_get(806AA60): 1 readers 2005-06-16 17:02:55 T213006: lock_release(806AA60): 0 readers 2005-06-16 17:02:55 T196621: lock_get: 806AA00 readlock 2005-06-16 17:02:55 T196621: lock_get(806AA00): 1 readers 2005-06-16 17:02:55 T196621: rli_llupdates: No unsynced RLI servers 2005-06-16 17:02:55 T196621: lock_release(806AA00): 0 readers 2005-06-16 17:03:17 T32771: auth_getperms(/DC=org/DC=doegrids/OU=People/CN=Jens- S. Voeckler 45904): localuser voeckler perms FFFF 2005-06-16 17:03:17 T32771: authcb: Accepted connection from /DC=org/DC=doegrids /OU=People/CN=Jens-S. Voeckler 45904 2005-06-16 17:03:25 T163851: lrc_bfiupdates: 0 2005-06-16 17:03:28 T81926: db_open: lrc1000 dbuser 2005-06-16 17:03:28 T81926: db_open: rli1000 dbuser 2005-06-16 17:03:28 T81926: db_lrc_create: test me 2005-06-16 17:03:28 T81926: PFN doesn't exist: me I think it's progress. We are just not "there" yet.
Created an attachment (id=643) [details] Replacement for db.c to accept SQL_NO_DATA for updateref insertions. Ouch. The last patch was wrong. I've made the correction which I am more confident will work. This one only accepts the SQL_NO_DATA return code when calling updateref() with the insert flag set. This happens when a user does a 'create mylfn mypfn' for a 'mypfn' that does not yet exist. Hope this really resolves the problem!
Also, to answer the question about where the 'fix' is located. It's part of the bugzilla record as an attachment: http://bugzilla.globus.org/bugzilla/attachment.cgi?id=643&action=view Replace "replica/rls/server/db.c" with this code and build. If you have an expanded source tree you can just replace the "db.c" file with this source and run your build again. If not, you can also: 'tar zxvf' the globus_rls_server-4.0.tar.gz package, cd into it, replace the db.c file, and $GPT_LOCATION/sbin/gpt-build -force -verbose gcc32dbgpthr
When db returns "no data", SQLRowCount call will fail and whole transaction will be rolled back without db update, so new entry will not be inserted anyway. I think the code should looks like this: if (!SQLUPDATEOK(r)) { // Handle errors odbcerr(r, SQL_HANDLE_STMT, h->stmt, errmsg); return GLOBUS_RLS_DBERROR; } if (SQLOK(r)) { // Count rows only if we got data SQLRowCount(h->stmt, &rows); if (rows > 0) goto ok; } This code works for me.
Additionally delete operations must be checked against "no data" result.
Before I compile the next batch and spent hours on it (I only know how to compile GT 4.0.1, not subcomponents), can somebody please attach to this bug report the corrections to the corrections as db.c file? Thanks.
Created an attachment (id=644) [details] Change SQLOK to SQLUPDATEOK after SQL update and delete queries.
Sorry, I forget to note that diff is against previous db.c replacement http://bugzilla.globus.org/bugzilla/attachment.cgi?id=643&action=view With this patch I can create and delete mappings and attributes. Is there a testsuite for for RLS?
Actually, Rob was posting the complete db.c file, not a diff.
I definitely agree with comment #6, http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=3403#c6 I think the code should be reworked that way or something similar. (Though I'm a bit confused about why SQLRowCount wouldn't have caused a rollback with earlier odbc drivers, since that line would have been executed in the past when no row had been updated... Anyway, I'll make the change.) However, as for comment #7, I'm not yet convinced that SQLUPDATEOK(r) should be applied to all deletes. Since the ODBC spec states that: "...a searched update or delete statement was executed but did not affect any rows at the data source, the ODBC 3.x driver should return SQL_SUCCESS. ...an ODBC 3.x application working with an ODBC 3.x driver ... should return SQL_NO_DATA..." I think it should be applied on a case-by-case basis where the delete is not necessarily expected to actually delete an entry. For instnace, the delete during when expiring LFN entries at an RLI -- may or may not delete any since no LFN may be past its expiration time. On the other hand, when deleting an LFN-PFN mapping (e.g., 'delete mylfn mypfn') that at least the delete from the mapping table should be expected to return SUCCESS rather than NO_DATA. Having said that, I do agree that the SQLUPDATEOK should be applied to at least some of the 'deletes' -- thanks for making a point of that. I'm going to need to take a bit more time to go through each 'update' and 'delete' statement more closely. Maybe on monday I'll have something.
Jens, Regarding comment #8 -- I hope these instructions will be helpful in reducing the time to apply an update/patch. % export GLOBUS_LOCATION=/path/to/globus/install % export GPT_LOCATION=$GLOBUS_LOCATION % cd /tmp % cvs co -r globus_4_0_branch replica/rls/server % cd replica/rls/server % ./bootstrap -- replace db.c with the updated version, or patch it with the diff -- % $GPT_LOCATION/sbin/gpt-build -force -verbose gcc32dbgpthr This should work for you to allow you to build a single component rather than all of gt 4.0.1.
I just tried to keep behavior independent from ODBC version. I think, if delete query expects some data be actually deleted, then NO_DATA reply is not a DB error - it is probably a RLS logic error, and it should be handled and reported appropriately. There are SQLRowCount checks after some delete queries, they will work as earlier (including (rows != 1) checks). So, SQLUPDATEOK can be checked after all delete queries to handle DB errors, and additional reply analysis may exists to checks internal RLS invariants.
I've tried Rob's new patch, but it had a problem when I tried to create a pair. This could have been an artefact from the previous failure where something failed: rls> create test this globus_rls_client: PFN doesn't exist: this I then used Alexander's patch, and now my RLS appears to work for me: rls> create that this rls> q w l l * that: this Even does its RLI updates: Version: 4.0 Uptime: 00:00:58 LRC stats update method: lfnlist update method: bloomfilter updates lfnlist: rls://xxxx last 06/17/05 17:43:33 lfnlist update interval: 86400 bloomfilter update interval: 900 numlfn: 1 numpfn: 1 nummap: 1 RLI stats updated by: rls://xxx last 06/17/05 17:44:19 updated via lfnlists numlfn: 1 numlrc: 1 numsender: 1 nummap: 1 I believe I can now start to migrate our RLS 2.1.5 servers :-)
(From update of attachment 643 [details]) Alexander's patch should be used instead, so I'm obsolete'ing my db.c replacement.
Is it checked into the trunk or branch?
No, I won't be committing any changes until I can spend more time on it -- the issue of when/where to accept this rc from odbc requires a more detailed review of the sql queries in rls.
Is the attachment #643 [details] from comment #9 latest db.c file version with Alexander's patch?
Rob, have you found the time to verify and check in the patches posted on this bug? Without it, RLS will compile but not work with PostGreSQL, contradicting the claim on the GT documentation pages. Since I am a Pg guy, I would really love to see the patches, which appear to work well in my installation, in the GT repository, so that the community can benefit from it, too.
Unfortunately, I have not been able to work on verifying the patch and moving it into CVS. I am trying to get to it now.
Fix now committed to CVS repository HEAD. It is nearly identical to the last patch in this record.
Created an attachment (id=723) [details] Patch from GT 4.0.1 release to the CVS commit for this fix This is a patch for anyone using GT 4.0.1 release of RLS. This patch does not take any previous patch into consideration -- it is a diff from the 4.0.1 release to the CVS commit for this fix.