Bug 3403 - ODBC 3.0 return code (SQL_NO_DATA) not supported for searched update/delete SQL calls
: ODBC 3.0 return code (SQL_NO_DATA) not supported for searched update/delete S...
Status: RESOLVED FIXED
: Replica Location
RLS
: development
: PC All
: P3 normal
: ---
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2005-05-23 16:50 by
Modified: 2005-10-26 20:02 (History)


Attachments
Patch to db.c to accept SQL_NO_DATA and SQL_NO_DATA_FOUND on updates (75.75 KB, patch)
2005-06-15 17:42, Rob S
Details
Replacement for db.c to accept SQL_NO_DATA for updateref insertions. (75.73 KB, patch)
2005-06-16 17:27, Rob S
Details
Change SQLOK to SQLUPDATEOK after SQL update and delete queries. (12.42 KB, patch)
2005-06-17 10:18, Alexander V. Inyukhin
Details
Patch from GT 4.0.1 release to the CVS commit for this fix (4.93 KB, patch)
2005-10-26 20:02, Rob S
Details


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2005-05-23 16:50:23
| I installed GT4 with RLS using PostgreSQL ODBC driver.
| When I try to add an entry to the server using
| "globus-rls-cli create 1234 gsiftp://client.grid/tmp/file
| rls://client.grid/",
| I get "globus_rls_client: DB error" error message
| and "odbc_error: SQLGetDiagRec failed: 100" in the logs.
| 
| ODBC trace shows, that failed SQL query is
| "update t_pfn set ref = ref + 1 where name =
| 'gsiftp://client.grid/tmp/file'".
| This operation returns SQL_NO_DATA_FOUND,
| but updateref function (replica/rls/server/db.c:2145)
| do not accept it.
| 
| Is it RLS or PostgreSQL issue? Did I miss something?
| I didn't find similar problems with RLS by other users,
| but SQL_NO_DATA_FOUND seems to be correct value here.

According to, http://msdn.microsoft.com/library/default.asp?url=/library/en-
us/odbc/htm/odbcreturning_sql_no_data.asp, it depends on the combination of 
versions between the odbc app and odbc driver. This is a bit confusing to me 
to. I'm not sure which return code RLS should expect but it looks like a 
simple change to RLS to differentiate between RCs pertaining to queries versus 
searched updates/deletes could help the situation.
------- Comment #1 From 2005-06-15 17:42:55 -------
Created an attachment (id=641) [details]
Patch to db.c to accept SQL_NO_DATA and SQL_NO_DATA_FOUND on updates

The file can be dropped in replica/rls/server/ and then rebuild/install the
server.
------- Comment #2 From 2005-06-15 17:45:32 -------
I'd really appreciate it if someone can try this patch out and confirm that 
the patch corrects the bug. It will take me longer if I have to setup a pgsql 
environment and recreate the bug first. thanks
------- Comment #3 From 2005-06-16 17:07:45 -------
Is it checked into the branch? The rebuild/install is a very vague description.
I only have instruction how to build a branch (or, optionally, trunk) from CVS,
and it takes 2.5 hours. 

[... some hours later ...]

OK, I've manually moved the db.c file after running configure and before running
make. It appears to make things move smoother: 

rls> create test me
globus_rls_client: PFN doesn't exist: me

The server does not get stuck any more :-),
but it does not do the "right thing (tm)" :-(

2005-06-16 17:02:55 T213006: rli_bfupdates:
2005-06-16 17:02:55 T213006: lock_get: 806AA00 readlock
2005-06-16 17:02:55 T213006: lock_get(806AA00): 1 readers
2005-06-16 17:02:55 T213006: lock_release(806AA00): 0 readers
2005-06-16 17:02:55 T213006: lock_get: 806AA60 readlock
2005-06-16 17:02:55 T213006: lock_get(806AA60): 1 readers
2005-06-16 17:02:55 T213006: lock_release(806AA60): 0 readers
2005-06-16 17:02:55 T196621: lock_get: 806AA00 readlock
2005-06-16 17:02:55 T196621: lock_get(806AA00): 1 readers
2005-06-16 17:02:55 T196621: rli_llupdates: No unsynced RLI servers
2005-06-16 17:02:55 T196621: lock_release(806AA00): 0 readers
2005-06-16 17:03:17 T32771: auth_getperms(/DC=org/DC=doegrids/OU=People/CN=Jens-
S. Voeckler 45904): localuser voeckler perms FFFF
2005-06-16 17:03:17 T32771: authcb: Accepted connection from /DC=org/DC=doegrids
/OU=People/CN=Jens-S. Voeckler 45904
2005-06-16 17:03:25 T163851: lrc_bfiupdates: 0
2005-06-16 17:03:28 T81926: db_open: lrc1000 dbuser
2005-06-16 17:03:28 T81926: db_open: rli1000 dbuser
2005-06-16 17:03:28 T81926: db_lrc_create: test me
2005-06-16 17:03:28 T81926: PFN doesn't exist: me

I think it's progress. We are just not "there" yet. 
------- Comment #4 From 2005-06-16 17:27:33 -------
Created an attachment (id=643) [details]
Replacement for db.c to accept SQL_NO_DATA for updateref insertions.

Ouch. The last patch was wrong.

I've made the correction which I am more confident will work. This one only
accepts the SQL_NO_DATA return code when calling updateref() with the insert
flag set.

This happens when a user does a 'create mylfn mypfn' for a 'mypfn' that does
not yet exist.

Hope this really resolves the problem!
------- Comment #5 From 2005-06-16 17:36:42 -------
Also, to answer the question about where the 'fix' is located. It's part of 
the bugzilla record as an attachment:

http://bugzilla.globus.org/bugzilla/attachment.cgi?id=643&action=view

Replace "replica/rls/server/db.c" with this code and build. If you have an 
expanded source tree you can just replace the "db.c" file with this source and 
run your build again.

If not, you can also:

'tar zxvf' the globus_rls_server-4.0.tar.gz package,
cd into it,
replace the db.c file, and
$GPT_LOCATION/sbin/gpt-build -force -verbose gcc32dbgpthr
------- Comment #6 From 2005-06-17 08:09:09 -------
When db returns "no data", SQLRowCount call will fail and whole transaction
will
be rolled back without db update, so new entry will not be inserted anyway.

I think the code should looks like this:

  if (!SQLUPDATEOK(r)) {        // Handle errors
    odbcerr(r, SQL_HANDLE_STMT, h->stmt, errmsg);
    return GLOBUS_RLS_DBERROR;
  }
  if (SQLOK(r)) {               // Count rows only if we got data
    SQLRowCount(h->stmt, &rows);
    if (rows > 0)
      goto ok;
  }

This code works for me.
------- Comment #7 From 2005-06-17 08:57:14 -------
Additionally delete operations must be checked against "no data" result.
------- Comment #8 From 2005-06-17 10:12:55 -------
Before I compile the next batch and spent hours on it (I only know how to
compile GT 4.0.1, not subcomponents), can somebody please attach to this bug
report the corrections to the corrections as db.c file? 

Thanks.
------- Comment #9 From 2005-06-17 10:18:45 -------
Created an attachment (id=644) [details]
Change SQLOK to SQLUPDATEOK after SQL update and delete queries.
------- Comment #10 From 2005-06-17 10:25:00 -------
Sorry, I forget to note that diff is against previous db.c replacement
http://bugzilla.globus.org/bugzilla/attachment.cgi?id=643&action=view

With this patch I can create and delete mappings and attributes. Is there a
testsuite for for RLS?
------- Comment #11 From 2005-06-17 11:00:32 -------
Actually, Rob was posting the complete db.c file, not a diff. 
------- Comment #12 From 2005-06-17 14:52:24 -------
I definitely agree with comment #6,

http://bugzilla.globus.org/bugzilla/show_bug.cgi?id=3403#c6

I think the code should be reworked that way or something similar. (Though I'm 
a bit confused about why SQLRowCount wouldn't have caused a rollback with 
earlier odbc drivers, since that line would have been executed in the past 
when no row had been updated... Anyway, I'll make the change.)

However, as for comment #7, I'm not yet convinced that SQLUPDATEOK(r) should 
be applied to all deletes. Since the ODBC spec states that:

"...a searched update or delete statement was executed but did not affect any 
rows at the data source, the ODBC 3.x driver should return SQL_SUCCESS. ...an 
ODBC 3.x application working with an ODBC 3.x driver ... should return 
SQL_NO_DATA..."

I think it should be applied on a case-by-case basis where the delete is not 
necessarily expected to actually delete an entry. For instnace, the delete 
during when expiring LFN entries at an RLI -- may or may not delete any since 
no LFN may be past its expiration time. On the other hand, when deleting an 
LFN-PFN mapping (e.g., 'delete mylfn mypfn') that at least the delete from the 
mapping table should be expected to return SUCCESS rather than NO_DATA.

Having said that, I do agree that the SQLUPDATEOK should be applied to at 
least some of the 'deletes' -- thanks for making a point of that.

I'm going to need to take a bit more time to go through each 'update' 
and 'delete' statement more closely. Maybe on monday I'll have something.
------- Comment #13 From 2005-06-17 15:14:26 -------
Jens,

Regarding comment #8 -- I hope these instructions will be helpful in reducing 
the time to apply an update/patch.

% export GLOBUS_LOCATION=/path/to/globus/install
% export GPT_LOCATION=$GLOBUS_LOCATION
% cd /tmp
% cvs co -r globus_4_0_branch replica/rls/server
% cd replica/rls/server
% ./bootstrap

 -- replace db.c with the updated version, or patch it with the diff --

% $GPT_LOCATION/sbin/gpt-build -force -verbose gcc32dbgpthr

This should work for you to allow you to build a single component rather than 
all of gt 4.0.1.
------- Comment #14 From 2005-06-17 15:21:46 -------
I just tried to keep behavior independent from ODBC version.

I think, if delete query expects some data be actually deleted,
then NO_DATA reply is not a DB error - it is probably a RLS logic error,
and it should be handled and reported appropriately.
There are SQLRowCount checks after some delete queries,
they will work as earlier (including (rows != 1) checks).

So, SQLUPDATEOK can be checked after all delete queries
to handle DB errors, and additional reply analysis may exists
to checks internal RLS invariants.
------- Comment #15 From 2005-06-17 17:50:04 -------
I've tried Rob's new patch, but it had a problem when I tried to create a pair.
This could have been an artefact from the previous failure where something failed: 

rls> create test this
globus_rls_client: PFN doesn't exist: this

I then used Alexander's patch, and now my RLS appears to work for me:

rls> create that this
rls> q w l l *
  that: this

Even does its RLI updates:

Version:    4.0
Uptime:     00:00:58
LRC stats
  update method: lfnlist
  update method: bloomfilter
  updates lfnlist:     rls://xxxx last 06/17/05 17:43:33
  lfnlist update interval: 86400
  bloomfilter update interval: 900
  numlfn: 1
  numpfn: 1
  nummap: 1
RLI stats
  updated by: rls://xxx last 06/17/05 17:44:19
  updated via lfnlists
  numlfn: 1
  numlrc: 1
  numsender: 1
  nummap: 1

I believe I can now start to migrate our RLS 2.1.5 servers :-)
------- Comment #16 From 2005-06-17 20:31:50 -------
(From update of attachment 643 [details])
Alexander's patch should be used instead, so I'm obsolete'ing my db.c
replacement.
------- Comment #17 From 2005-06-20 16:49:36 -------
Is it checked into the trunk or branch? 
------- Comment #18 From 2005-06-20 16:58:10 -------
No, I won't be committing any changes until I can spend more time on it -- the 
issue of when/where to accept this rc from odbc requires a more detailed 
review of the sql queries in rls.
------- Comment #19 From 2005-06-24 01:24:52 -------
Is the attachment #643 [details] from comment #9 latest db.c file version with
Alexander's
patch?
------- Comment #20 From 2005-10-06 10:35:55 -------
Rob, have you found the time to verify and check in the patches posted on this
bug? Without it, RLS will compile but not work with PostGreSQL, contradicting
the claim on the GT documentation pages. Since I am a Pg guy, I would really
love to see the patches, which appear to work well in my installation, in the GT
repository, so that the community can benefit from it, too. 
------- Comment #21 From 2005-10-10 19:08:43 -------
Unfortunately, I have not been able to work on verifying the patch and moving 
it into CVS. I am trying to get to it now.
------- Comment #22 From 2005-10-26 19:58:52 -------
Fix now committed to CVS repository HEAD. It is nearly identical to the last
patch in this record.
------- Comment #23 From 2005-10-26 20:02:17 -------
Created an attachment (id=723) [details]
Patch from GT 4.0.1 release to the CVS commit for this fix

This is a patch for anyone using GT 4.0.1 release of RLS. This patch does not
take any previous patch into consideration -- it is a diff from the 4.0.1
release to the CVS commit for this fix.