Showing posts with label model. Show all posts
Showing posts with label model. Show all posts

Friday, March 30, 2012

How to get the Training Error from the Model?

Hi, everyone here.

I am trying to get the training error of the model processed which can reveal how much the model fits the cases. The training error can reveal how many cases (from training set) are classified correctly. The lower traing error is, the more the model fits the training set. (Maybe overfitted) But I found it hard to get. I saw the life chart in AS 2005 which I am not quite understand and don't know how to code it in my program.

Is there some way to getting traing error or predicting error?

I am now using this awful way to get the training error:

select t.*,CollegeTree.CollegePlans as pred

from collegetree
prediction join
openquery(DSource,'select * from CollegePlans') as t
on CollegeTree.StudentID = t.StudentID and
...
where t.CollegePlans = CollegeTree.CollegePlans;

and then use datareader.ItemCount to get the count of cases which classified correctly.

Keyword: train error,predict error, data mining, analysis service

Instead of the lift chart, you could use the classification matrix (it is next to the lift chart in BI dev studio).

The Classification Matrix gives you the exactly the number of correct vs. incorrect classifications

|||Thx, it is what I need. I am still wondering how to program with the classification matrix. Can I just get it from SQL statement or AMO? And that classification matrix need a prediction procedure that may take too long time. Could I get the training error just after the model trained?|||Currently, there is no way to get the training error just after the model trained.
The Classification Matrix is a stored procedure call. The syntax is a bit complex (an XML encoding of the queries used to compute the matrix) and the easiest way to get the statement is to
- launch SQL Server Profiler
- Connect to the Analysis Services server
- Launch the classification matrix in the tools
- look in the profiler for a statement beginning with CALL Microsoft.AnalysisServices.System.DataMining.GetConfusionMatricesUsingDatasource

Then, you can execute that statement as any other DMX statement|||

Thx for your advice, it's really helpful.

I'm now using

Code Snippet

select flattened topcount((select [SUPPORT] from NODE_DISTRIBUTION),[SUPPORT],1) from DModel.content where NODE_TYPE=4

to export the number of positive cases for every rule, i.e. the max [support] for the leaf node( NODE_TYPE=4),

and add them up.

Code Snippet

While DSReader.Read()
pCase = pCase + DSReader(0)
End While
DMTrainingError = pCase/TotalCasesCount

I hava checked the result with the BI Classification Matrix, which prooved my idea.

The rules after processing the model are stored in a table named <Model Name>.content, and the column of NODE_DISTRIBUTION is a nested table containing the cases distribution of each class, and the prediction follows the most probable class with the max support.

So pick up the max value of support in every NODE_DISTRIBUTION and add them up, then you will get the number of positive cases. The ratio of positive cases in all cases can be used as Training Error.

How to get the Training Error from the Model?

Hi, everyone here.

I am trying to get the training error of the model processed which can reveal how much the model fits the cases. The training error can reveal how many cases (from training set) are classified correctly. The lower traing error is, the more the model fits the training set. (Maybe overfitted) But I found it hard to get. I saw the life chart in AS 2005 which I am not quite understand and don't know how to code it in my program.

Is there some way to getting traing error or predicting error?

I am now using this awful way to get the training error:

select t.*,CollegeTree.CollegePlans as pred

from collegetree
prediction join
openquery(DSource,'select * from CollegePlans') as t
on CollegeTree.StudentID = t.StudentID and
...
where t.CollegePlans = CollegeTree.CollegePlans;

and then use datareader.ItemCount to get the count of cases which classified correctly.

Keyword: train error,predict error, data mining, analysis service

Instead of the lift chart, you could use the classification matrix (it is next to the lift chart in BI dev studio).

The Classification Matrix gives you the exactly the number of correct vs. incorrect classifications

|||Thx, it is what I need. I am still wondering how to program with the classification matrix. Can I just get it from SQL statement or AMO? And that classification matrix need a prediction procedure that may take too long time. Could I get the training error just after the model trained?|||Currently, there is no way to get the training error just after the model trained.
The Classification Matrix is a stored procedure call. The syntax is a bit complex (an XML encoding of the queries used to compute the matrix) and the easiest way to get the statement is to
- launch SQL Server Profiler
- Connect to the Analysis Services server
- Launch the classification matrix in the tools
- look in the profiler for a statement beginning with CALL Microsoft.AnalysisServices.System.DataMining.GetConfusionMatricesUsingDatasource

Then, you can execute that statement as any other DMX statement|||

Thx for your advice, it's really helpful.

I'm now using

Code Snippet

select flattened topcount((select [SUPPORT] from NODE_DISTRIBUTION),[SUPPORT],1) from DModel.content where NODE_TYPE=4

to export the number of positive cases for every rule, i.e. the max [support] for the leaf node( NODE_TYPE=4),

and add them up.

Code Snippet

While DSReader.Read()
pCase = pCase + DSReader(0)
End While
DMTrainingError = pCase/TotalCasesCount

I hava checked the result with the BI Classification Matrix, which prooved my idea.

The rules after processing the model are stored in a table named <Model Name>.content, and the column of NODE_DISTRIBUTION is a nested table containing the cases distribution of each class, and the prediction follows the most probable class with the max support.

So pick up the max value of support in every NODE_DISTRIBUTION and add them up, then you will get the number of positive cases. The ratio of positive cases in all cases can be used as Training Error.

sql

Friday, March 9, 2012

How to get Recovery Model of a SQL 2000 Database?

Hi ,

Can somebody help with an SQL Statement to list the Recovery models for
all the Databases in a server. I am trying to use the status column from sysdatabases..but i am not able to get the right statement.

Any help/references asap...

Thanks in advance.

Sasi.For SQL 2000

select name, databasepropertyex(name, 'Recovery') as RecoveryModel from master.dbo.sysdatabases order by name

For SQL 2005...

select name, recovery_model, recovery_model_desc from master.sys.databases

Good luck with it...

How to get predicion for monthly intervals?

hi,

I am using timeseries algorithm.I am training my model like the following

Date StudId Perf

5/1/2005 001 99

5/10/005 001 97.6

6/1/2005 001 94

6/10/2005 001 99

6/30/005 001 96

10/1/2005 001 100

Like that.

I need prediction Output like following

Date StudId Perf

10/1/2005 001 99

11/10/005 001 97.6

12/1/2005 001 94

1/10/2006 001 99

... how to write prediction query for this.

Thanks

Karthik.

have you tried a statement like below?

SELECT FLATTENED PredictTimeSeries(Perf) FROM Model?

I assume your model has Date as KEY TIME and StuId as KEY, right?

|||yes bodgan,you are right.|||So, does the query I suggested work correctly? Or are there different kind of results that you are looking for?|||

no bodgan,

the problem is the data.please look at the data, i am using to train and mt expected prediction result.I am expecting the prediction result so that the date values should be in a monthly interval.

Now I am getting results like this

Date StuId perf

10/1/2006 001 94

10/15/2006 001 99

11/10/2006 001 100

11/31/2006 001 75 Like that.

I need

Date StuId perf

10/1/2006 001 94

11/1/2006 001 99

12/1/2006 001 100

1/1/2007 001 75 Like that.

Thanks,

Karthik

|||

I see. With the specified input, you cannot get these results. The forecasting works by predicting the next N steps of the series. In order to get the results you want, your inputs should use 10/1/2006, 11/1/2006 and so on. A solution could be to average the values, per month, for each student ID. After that, as long as your input contains one entry (per student) per month, the prediction will result in one result per month per student

Wednesday, March 7, 2012

How to get notification about completion of applying snapshot

All,
In Transactional Pull model, is there a way from any storedprocedure to
get notification from the distribution agent that the initial snapshot
has been applied on the subscriber?
This notification has to be got in my Windows forms C# application and
related UI changes done.
Can anyone help me?
Thanks,
Lavanya
Have a final job step executed on completion of the snapshot job. This could
notify the C# application. You could do this through a file existence check,
mail slots, named pipes or some other mechanism.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Lavanya" <lavanyah@.gmail.com> wrote in message
news:1165406832.173244.309460@.l12g2000cwl.googlegr oups.com...
> All,
> In Transactional Pull model, is there a way from any storedprocedure to
> get notification from the distribution agent that the initial snapshot
> has been applied on the subscriber?
> This notification has to be got in my Windows forms C# application and
> related UI changes done.
> Can anyone help me?
> Thanks,
> Lavanya
>
|||I would require to obtain notification from the distribution agent and
not the snapshot agent.
And since the agent runs on the subscriber for pull model, how do i
track the end of application of the first snapshot?
Thanks,
Lavanya
Hilary Cotter wrote:[vbcol=seagreen]
> Have a final job step executed on completion of the snapshot job. This could
> notify the C# application. You could do this through a file existence check,
> mail slots, named pipes or some other mechanism.
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> "Lavanya" <lavanyah@.gmail.com> wrote in message
> news:1165406832.173244.309460@.l12g2000cwl.googlegr oups.com...
|||Am still not clear from where i can get the state of the
pull-distribution agent.
Fyi: I have a centralised read only subscriber which has Pull
transactional model configured.
1. Can i use sp_repltrans at publisher database and confirm that the
status of Log reader agent is idle if no rows are returned?
2. Can i use sp_browsereplcmds at distribution database and confirm
that the status of the distribution agent running at the subscriber is
idle if no rows are returned?
Is there any way i can find at the subscriber whether all my
publishers' distribution agents are in idle state?
Please reply asap which would help in increasing the robustness of my
application.
Thanks in advance,
Lavanya
Lavanya wrote:[vbcol=seagreen]
> I would require to obtain notification from the distribution agent and
> not the snapshot agent.
> And since the agent runs on the subscriber for pull model, how do i
> track the end of application of the first snapshot?
> Thanks,
> Lavanya
> Hilary Cotter wrote:
|||I don't believe you can use sp_repltrans as this only, nor is sp_
browsereplcmds helpful. I think you would be best off setting
historyverboselevel to 2 and then querying mdistribution_history looking for
messages like Delivered snapshot from the
'unc\ServerName_PublicationName\20061219073651\' sub-folder in 1522
milliseconds
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Lavanya" <lavanyah@.gmail.com> wrote in message
news:1166511416.572294.66530@.80g2000cwy.googlegrou ps.com...
> Am still not clear from where i can get the state of the
> pull-distribution agent.
> Fyi: I have a centralised read only subscriber which has Pull
> transactional model configured.
> 1. Can i use sp_repltrans at publisher database and confirm that the
> status of Log reader agent is idle if no rows are returned?
> 2. Can i use sp_browsereplcmds at distribution database and confirm
> that the status of the distribution agent running at the subscriber is
> idle if no rows are returned?
> Is there any way i can find at the subscriber whether all my
> publishers' distribution agents are in idle state?
> Please reply asap which would help in increasing the robustness of my
> application.
> Thanks in advance,
> Lavanya
> Lavanya wrote:
>
|||Thanks for your response.
Just to confirm my understanding, both sp_repltrans and
sp_browsereplcmds will not help in finding the current state of
logreader and (pull)distribution agents
As you said, i set the historyverboselevel to 2 for the distribution
agent and tried to look into
MSdistribution_history table but could not find "Delivered snapshot..."
string. I found the below instead:
Applied script 'Table1_4.dri'
Applied script 'Table2_4.dri'
Applied script 'Table3_4.dri'
Applied script 'Table4_4.dri'
3 transaction(s) with 35 command(s) were delivered.
No replicated transactions are available.
Hence, can i look out for "No replicated transactions are available."
instead?
What is that 1522 milliseconds?
Also, is there a way to look into my centralized pull subscriber(SQL
2005) whether all the distribution agents for publishers(SQL 2000) are
currently idle or not?
Regards,
Lavanya
Hilary Cotter wrote:[vbcol=seagreen]
> I don't believe you can use sp_repltrans as this only, nor is sp_
> browsereplcmds helpful. I think you would be best off setting
> historyverboselevel to 2 and then querying mdistribution_history looking for
> messages like Delivered snapshot from the
> 'unc\ServerName_PublicationName\20061219073651\' sub-folder in 1522
> milliseconds
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> "Lavanya" <lavanyah@.gmail.com> wrote in message
> news:1166511416.572294.66530@.80g2000cwy.googlegrou ps.com...
|||try looking for > Applied script 'Table4_4.dri' if that is the last thing to
be delivered in your snapshot.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Lavanya" <lavanyah@.gmail.com> wrote in message
news:1166594011.399191.87600@.n67g2000cwd.googlegro ups.com...
> Thanks for your response.
> Just to confirm my understanding, both sp_repltrans and
> sp_browsereplcmds will not help in finding the current state of
> logreader and (pull)distribution agents
> As you said, i set the historyverboselevel to 2 for the distribution
> agent and tried to look into
> MSdistribution_history table but could not find "Delivered snapshot..."
> string. I found the below instead:
> Applied script 'Table1_4.dri'
> Applied script 'Table2_4.dri'
> Applied script 'Table3_4.dri'
> Applied script 'Table4_4.dri'
> 3 transaction(s) with 35 command(s) were delivered.
> No replicated transactions are available.
> Hence, can i look out for "No replicated transactions are available."
> instead?
> What is that 1522 milliseconds?
> Also, is there a way to look into my centralized pull subscriber(SQL
> 2005) whether all the distribution agents for publishers(SQL 2000) are
> currently idle or not?
> Regards,
> Lavanya
>
> Hilary Cotter wrote:
>
|||Thanks Hilary for your help.
For the snapshot successful delivery confirmation, i check the status
of 'No replicated Transaction available' comment in
MSdistribution_history table from my C# application. This is more
reliable as am not sure of the number that succeeds the table name nor
am sure of the table which is the last to be delivered. This works
fine.
Now, i have another clarification.
>From my C#.Net application, i need to check whether there are any
pending transactions from publ. to subs. before i allow the user to
remove replication setup from the UI.
As previously asked,
1. Can i use sp_repltrans at publisher database, check if no rows are
returned and confirm that no data is pending to be transferred ?
2. Can i use sp_browsereplcmds at distribution database,check if no
rows are returned and hence confirm that subscriber and publisher are
in total sync ?
Reply asap.
Thanks in advance,
-Lavanya
Hilary Cotter wrote:[vbcol=seagreen]
> try looking for > Applied script 'Table4_4.dri' if that is the last thing to
> be delivered in your snapshot.
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> "Lavanya" <lavanyah@.gmail.com> wrote in message
> news:1166594011.399191.87600@.n67g2000cwd.googlegro ups.com...
|||No, to both. You have to hack msdistribution_status for the info you are
looking for.
Query MSsubscriptions to get the values you need for your subscriber
SELECT t.article_id,s.agent_id,
'UndelivCmdsInDistDB'=SUM(CASE WHEN xact_seqno > h.maxseq THEN 1 ELSE 0
END),
'DelivCmdsInDistDB'=SUM(CASE WHEN xact_seqno <= h.maxseq THEN 1 ELSE 0 END)
FROM (SELECT article_id,publisher_database_id, xact_seqno
FROM MSrepl_commands with (NOLOCK) ) as t
JOIN (SELECT agent_id,article_id,publisher_database_id FROM MSsubscriptions
with (NOLOCK) ) AS s
ON (t.article_id = s.article_id AND
t.publisher_database_id=s.publisher_database_id )
JOIN (SELECT agent_id,'maxseq'= isnull(max(xact_seqno),0x0) FROM
MSdistribution_history with (NOLOCK) GROUP BY agent_id) as h
ON (h.agent_id=s.agent_id)
where s.publisher_database_id=57 --obtained from mssubscriptions
and s.publisher_id=0 --obtained from mssubscriptions
and s.subscriber_id=7 --obtained from mssubscriptions
and s.subscriber_db='test' --obtained from mssubscriptions
GROUP BY t.article_id,s.agent_id
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Lavanya" <lavanyah@.gmail.com> wrote in message
news:1167212049.957146.58730@.a3g2000cwd.googlegrou ps.com...
> Thanks Hilary for your help.
> For the snapshot successful delivery confirmation, i check the status
> of 'No replicated Transaction available' comment in
> MSdistribution_history table from my C# application. This is more
> reliable as am not sure of the number that succeeds the table name nor
> am sure of the table which is the last to be delivered. This works
> fine.
> Now, i have another clarification.
> pending transactions from publ. to subs. before i allow the user to
> remove replication setup from the UI.
> As previously asked,
> 1. Can i use sp_repltrans at publisher database, check if no rows are
> returned and confirm that no data is pending to be transferred ?
> 2. Can i use sp_browsereplcmds at distribution database,check if no
> rows are returned and hence confirm that subscriber and publisher are
> in total sync ?
> Reply asap.
> Thanks in advance,
> -Lavanya
> Hilary Cotter wrote:
>