SQLSaturday 764 – Slovakia – Bratislava

Hello folks / Dobrý deň, priatelia

Next Saturday (06/23th ) is time for Slovakia host the SQLSaturday, and I’m happy to say that I’ll be speking :).

It will be my first time visiting Slovakia, so I’m looking forward for it :), I have 2 sessions :), the first one will be “How to build solution for High Availability/Disaster Recovery” and the second one “High availability for SQL Server using Azure as DR site“.

So, in case you are arund Bratislava by Saturday 23th, show up at the event, it will be a pleasure to meet you there.

As you know, SQLSaturday is totally free 🙂 so hurry up and register your self

The schedule:

Room Borneo Madagascar
08:00 AM – 09:00 AM Registration
09:00 AM – 09:15 AM Welcome
09:15 AM – 10:15 AM
10:15 AM – 10:30 AM Coffee Break
10:30 AM – 11:30 AM
11:30 AM – 11:45 AM Coffee Break
11:45 AM – 12:45 PM
12:45 PM – 01:45 PM Lunch Break
01:45 PM – 02:45 PM
02:45 PM – 03:00 PM Coffee Break
03:00 PM – 04:00 PM
04:00 PM – 04:15 PM Coffee Break
04:15 PM – 05:15 PM
05:15 PM – 05:45 PM Raffle

See you there!

Anúncios

AlwaysOn – Distributed AG

Hello Friends!

Distributed availability groups was introduced in SQL Server 2016.

Note
DAG” is not the official abbreviation for distributed availability group, because the abbreviation is already used for the Exchange Database Availability Group feature. This Exchange feature has no relation to SQL Server availability groups or distributed availability groups.

The DAG came to allow us to spans two separates AG (configured on two different Windows Failover Clustering). Also, the Availability group that participates at DAG do not need to be hosted at the same location, they can be a  physical, virtual, on-premises, in the public cloud, or anywhere that supports an availability-group deployment. As long as two availability groups can communicate, you can configure a distributed availability group with them.

Opposite of traditional AG that has resources configured at Windows Server Failover Clustering (WSFC), the DAG do not store resources at WSFC, all information about DAG it is stored at SQL Server

We can use the DAG for this three main usage scenarios

  • Disaster recovery and easier multi-site configurations
  • Migration to new hardware or configurations, which might include using new hardware or changing the underlying operating systems
  • Increasing the number of readable replicas beyond eight in a single availability group by spanning multiple availability groups

How it works?

DAG will provide to the application the possibility to connect to a read-only replica with different listener.
As mentioned before, each server will have their own WSFC (also can be at different domains)

dag1

As we can see at above image, we can create a DAG between 2 AG (or more), the the AG will consider the listener as nodes, so it is not possible to create DAG between the server name, we must use the virtual name (as CAP on FCI)

In this scenario, the Global Primary will send the logs to Primary node at second AG (called as forwarder) and this forwarder will send the logs to

You can configure the data movement in distributed availability groups as synchronous or asynchronous. However, data movement is slightly different within distributed availability groups compared to a traditional availability group. Although each availability group has a primary replica, there is only one copy of the databases participating in a distributed availability group that can accept inserts, updates, and deletions. As shown in the above image, AG1 is the primary availability group. Its primary replica sends transactions to both the secondary replicas of AG1 and the primary replica of AG2. The primary replica of AG2 is also known as a forwarder. A forwarder is a primary replica in a secondary availability group in a distributed availability group. The forwarder receives transactions from the primary replica in the primary availability group and forwards them to the secondary replicas in its own availability group.

Considerations

It is important to know that DAG will keep just one replica writable, the remains replica it is Read-only!

We also we can create DAG from DAG, this off course will create some complexity for DBA’s administrations.

dag2

Implementation instructions

Implementation prerequisites

  • We can only create DAG between AG and should use the listener as Nodes.
  • Distributed availability groups can not be configured with Standard edition or mix of Standard and Enterprise edition.

Technical instructions

To create a DAG you can follow this example:
OBS: Note that the code are using automatic seeding

--At primary
CREATE AVAILABILITY GROUP [<name of distributed AG>]
WITH (DISTRIBUTED)
AVAILABILITY GROUP ON ‘<name of first AG>’ WITH
( LISTENER_URL = ‘tcp://<name of first AG Listener>:5022’,
AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
FAILOVER_MODE = MANUAL,
SEEDING_MODE = AUTOMATIC ),
‘<name of second AG>’ WITH
( LISTENER_URL = ‘tcp:// <name of second AG Listener>:5022’,
AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
FAILOVER_MODE = MANUAL,
SEEDING_MODE = AUTOMATIC );

At secondary AG, now let’s join it to DAG

ALTER AVAILABILITY GROUP [<name of the distributed AG>]
join
AVAILABILITY GROUP ON
‘<name of the first AG>’ WITH
( LISTENER_URL = ‘tcp://<name of first AG Listener>:5022’,
AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
FAILOVER_MODE = MANUAL,
SEEDING_MODE = MANUAL  ),
‘<name of the second AG>’ WITH
( LISTENER_URL = ‘tcp:// <name of second AG Listener>:5022’,
AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
FAILOVER_MODE = MANUAL,
SEEDING_MODE = AUTOMATIC );

Limitations

  • Only manual failover is supported for a distributed availability group. In a disaster recovery situation where you are switching data centers, you should not configure automatic failover (with rare exceptions).
  • You need monitor network latency at a different layer for the data transport. The difference is that each WSFC cluster maintains its own availability.
  • We recommend asynchronous data movement, because this approach would be for disaster-recovery purposes.
  • If you configure synchronous data movement between the primary replica and at least one secondary replica of the second availability group, and you configure synchronous movement on the distributed availability group, a distributed availability group will wait until all synchronous copies acknowledge that they have the data.

Monitoring DAG

We can use the Dashboard at SSMS to monitor DAG as we do for traditional AG, and also we can query the DMV’s

SELECT ag.[nameas 'AG Name', ag.is_distributed, ar.replica_server_name as 'Underlying AG', ars.role_desc as 'Role', ars.synchronization_health_desc as 'Sync Status'
FROM    sys.availability_groups ag,
sys.availability_replicas ar,      
sys.dm_hadr_availability_replica_states ars      
WHERE   ar.replica_id = ars.replica_id
and     ag.group_id = ar.group_id
and ag.is_distributed = 1

 


References

SQLSaturday #707 Italy – I’m going

sqlsat707_header

Ciao amici!

Next Friday February, 17th,2018 Pordenone, Italy will host next SQL Saturday #707 edition.

It will be perfect geek Saturday,  learning and making network from the Data Platform Experts.

My session will start at 03:40PM, So, if you are like me and want to explore Italy 🙂 , join us on 17th Feb.

It is a free event, all the details can be found here: http://www.sqlsaturday.com/707/eventhome.aspx

See you there!

Marcelo Fernandes

How to test your workload before SQL Migration

Hi Friends,

Today I’ll talk about how to test SQL Workloads before a SQL Migration or to test new features. for this article, let’s suppose that I have a SQL 2012 Instance and I have been asked to switch to SQL 2016, before moving in, I want to test my workloads on SQL 2016 to be sure that I won’t have performance problems with or breaking changes.
The goal is:
– Capture workload from SQL2012
– Replay the workload on SQL2016
– Generate reports with results.

We have some ways to reach the goal, for example:
– Backup from SQL2012 and Restore on SQL2016, change the application connection to SQL2016;
– Use Distributed Replay do simulate workload.
– etc….

The first option we will have downtime for business, since we will point the application just for test and we don’t know what will happen. this can also be dangerous if we have third party integration (we use other application to share or get information, probably we will have data loss to roll back to SQL2012)

The second option is good option, since SQL 2012 we can use Distributed Replay, although this is a command line tool, will just have the replay and you will spend some time to crate the command to capture and replay, we still need to create the report manually.

Now we have a fantastic tool DEA (Database Experimentation Assistant) that is new tool to capture workload form A server and replay on B server. DEA support previous version of SQL Server (2005 and above), so if you want to upgrade a very old instance, you can just use this tool to be a step ahead of possible issues (performance, breaking changes, etc.).

DEA will also provide a very nice report with analysis metric for compatibility errors, degraded queries, query plans, etc., this will allow us to be more confident to do a successful upgrade.

How to use DEA

First step, download the DEA and install it.

The second step, is to setup Distributed Replay, behind the scenes DEA uses Distributed Replay, DEA will provide a user-friendly interface to capture and replay the workloads.

After DR and DEA Setup, we must to have a backup from our Database (initial position), we will restore this backup on new Server, so we will have the same point to replay the workload.

After backup you can start the DEA you should receive a welcome screen like this:

On the left side we have the menu, the first step is to start a workload capture on Source server, in this case SQL2012, so just click on camera icon.

Just click on +New Capture.

At this screen you must enter the Trace Name, Duration (starting for 5 min up to 3 hours), Source SQL Instance, Source Database name, Path to save the trace and a confirmation check box that you made a backup before start the capture (we will restore this backup on SQL2016 before replay, so SQL will compare apples with apples)

You can check the status while SQL is running the trace

After workload capture, we will be able to replay the workload on destination server (SQL2016), to start replaying the workload, click on Play icon on left menu and click over +New Replay:

Now before start replaying the workload, we must restore the backup that we executed before starting the trace (initial position) after restore, you must inform the Replay Name, Distributor Controller Machine, the Path with source trace (the same that we informed on capture trace screen), target SQL Server name, Path to save the trace during the replay (trace on SQL2016) and the check box to inform that we performed a database restore (initial position)

We need to wait for replay, during this phase we can check the status:

In the end, you should have a screen like this:

Now it is time for report, just click on Report icon on left menu and type the name of the server to host the Report and click over connect button:

Click on button +New Analysis Report, if is the first time and you do not have R for Windows and R Interop installed, you will receive a screen informing that you must install it.

After R setup, click over Try Again button and you will receive a screen to setup the Report Analysis, type a name for Report, the source trace file (SQL2012) and Target trace file (SQL2016) and click on Start

Wait for analysis completion

After the analysis, you will receive a dashboard report comparing the Source and Target server, you can click on graphs to drill down into the details, in this case will click on green area.

The detailed report will show all query text and the duration analysis, we also can click on it to see more details.

In this detailed report we will see the performance comparison between the executions, information about compatibility errors (breaking changes) and execution plans.

So that’s all for now, what do you think? Now it is easy to replay workloads before an upgrade isn’t it?

These articles may help you:
Setup Distributed Replay
Distributed Replay Docs
Distributed Replay Troubleshooting
DEA Capture Trace FAQ
DEA Replay FAQ
DEA Report Analysis FAQ
DEA Solution architecture to compare workloads

Thanks

Marcelo Fernandes

SQL 24 HOP – Palestrarei no SQL PASS 24 horas português

Olá Amigos

No próximo dia 14/11 (segunda-feira) iniciaremos mais um ciclo de palestras com diversos profissionais de SQL Server dos países que falam a língua portuguesa.

A minha sessão será no dia 14/11 as 19:00 (Horário de Brasilia) vou falar de SMB 3.0 como alternativa de storage. Corra, ainda da tempo de fazer a inscrição totalmente gratuíta e online! REGISTRE-SE

Veja a grade de palestras aqui.

 

Token Bloat – Cannot generate SSPI context

Olá amigos, há um tempo atrás enfrentei o famoso problema SSPI atípico e sempre adiava este post, bom finalmente decidi escrever o artigo. Espere que o ajude.

Problema

Um usuário sempre conectou ao SQL Server sem problemas, mas um dia ao tentar conectar-se ao mesmo serviço de SQL ele recebeu a seguinte mensagem:

sspi

Não houve nenhuma mudança por parte de infraestrutura e DBAs no servidor, o usuário simplesmente não consegue conectar mais, e o mais intrigante é que alguns usuários conseguem conectar-se e outros não.

Bom, na grande maioria das vezes o erro SSPI Context é gerado por falta de SPN em uma tentativa de conexão via kerberos com o SQL Server e para resolver este problema basta criar o SPN.

Listando os SPN atuais:

No prompt de comando digite o seguinte comando

SETSPN –L <domínio\conta_de_serviço_do_SQL>

spn

Nota: Se ao executar o comando setspn –L você não receber o SPN para a sua instância de SQL Server, você deve criar o SPN e você está recebendo o erro de SSPI Context, possivelmente o seu problema será resolvido com a criação manual do SPN com o seguinte comando:

SETSPN – A MSSQLSvc/fqdn <domínio>\<conta de serviço do SQL>

SETSPN – A MSSQLSvc/fqdn:1433 <domínio>\<conta de serviço do SQL>

Ex. SETSPN –A MSSLSvc/sqlproducao.contoso.com contoso\sqlServiceAcc

Maiores informações no artigo: https://msdn.microsoft.com/pt-br/library/ms191153.aspx

 

Como podemos observar no resultado do SETSPN –L no meu ambiente eu tenho os SPNs corretamente criados!

O que justifica algumas conexões conseguirem conectar-se, pois o SPN está ok, e então por que algumas conexões recebem o SSPI?

Para responder esta questão precisamos analisar o ticket do kerberos, executando a ferramenta tokensz você poderá evidenciar o problema de token bloat com a seguinte sintaxe

tokensz.exe /compute_tokensize /user:<usuário_com_erro_sspi>

tokensz-1

Como podemos notar na imagem acima, o tamanho máximo do Token é 12000 (12K) e o token do usuário contoso\usuarioSQL que estou tentando conectar-se ao SQL e estou recebendo o erro SSPI contexto ultrapassou o limite.

Um token basicamente contém os grupos e permissões de um usuário e é criado no momento do logon e este token é repassado aos outros serviços/servidores conforme o usuário necessita autenticar-se para consumir os serviços (para maiores detalhes sobre kerberos consulte o artigo https://msdn.microsoft.com/en-us/library/bb742516.aspx)

Pois bem, se em um token temos a ACL (Access control List) vamos investigar os usuários criados:

Os dois usuários são membros do grupo SQLacesso o qual foi criado o login no SQL Server, mas o segundo usuário é membro de mais algumas centenas de grupos.

Até o Windows 2008 R2 o tamanho máximo default de um token é de 12K a partir do Windows 2012 este valor foi alterado para 48K

Dependendo do tamanho de seu ambiente este limite de 12K é facilmente alcançado com um usuário pertencendo a aproximadamente 150 grupos.

Em uma empresa multinacional onde como boa prática são criados diversos grupos para controlar acessos a diversos compartilhamentos, servidores, aplicações etc. este limite é alcançado muito rapidamente.

aninhamento_grupo.png

Qual o tamanho de cada grupo?

Para calcular o tamanho do token utilizamos a seguinte formula:

TokenSize = 1200 + 40d + 8s

Onde:

D = (grupos domínio local + grupos universais externos)

S = (grupos globais + Universal)

Existem diversos scripts que automatizam este cálculo, abaixo um bem simples que encontrei no artigo http://www.cluberti.com/blog/2014/05/26/getting-kerberos-token-size-with-powershell/

# Always credit where due - this was found via
# http://jacob.ludriks.com/getting-kerberos-token-size-with-powershell/
 
#Gets max token size
#Run with .\get_tokensize.ps1 -Username "domain\username"
#Reference: http://support.microsoft.com/kb/327825
#tokensize = 1200 + 40d + 8s
Param(
    [Parameter(Mandatory=$True)]
    [String]$Username
)
$domain = ($username.split("\"))[0]
$user = ($username.split("\"))[1]
Import-Module ActiveDirectory
$rootdse = (Get-ADDomain $domain).distinguishedname
$server = (Get-ADDomain $domain).pdcemulator
$usergroups = Get-ADPrincipalGroupMembership -server $server $user | select distinguishedname,groupcategory,groupscope,name
$domainlocal = [int]@($usergroups | where {$_.groupscope -eq "DomainLocal"}).count
$global = [int]@($usergroups | where {$_.groupscope -eq "Global"}).count
$universaloutside = [int]@($usergroups | where {$_.distinguishedname -notlike "*$rootdse" -and $_.groupscope -eq "Universal"}).count
$universalinside = [int]@($usergroups | where {$_.distinguishedname -like "*$rootdse" -and $_.groupscope -eq "Universal"}).count
$tokensize = 1200 + (40 * ($domainlocal + $universaloutside)) + (8 * ($global + $universalinside))
Write-Host "
Domain local groups: $domainlocal
Global groups: $global
Universal groups outside the domain: $universaloutside
Universal groups inside the domain: $universalinside
Kerberos token size: $tokensize"

Outros scripts úteis

http://www.jhouseconsulting.com/2013/12/20/script-to-create-a-kerberos-token-size-report-1041

https://gallery.technet.microsoft.com/scriptcenter/Check-for-MaxTokenSize-520e51e5

Como Resolver

Você tem duas maneiras de resolver o problema, aumentando o default do token size ou excluindo grupos desnecessários.

Para aumentar o token size, você pode seguir as etapas do KB http://support.microsoft.com/kb/938118

Em cada estação:

  1. Inicie Regedt32.exe
  2. Localize a chave (HKLM\System\CurrentControlSet\Control\Lsa\Kerberos\Parameters)
  3. No menu Edit clique em NEW / DWORD e utilize os parâmetros abaixo:

Nome: MaxTokenSize

tipo: REG_DWORD

Base: Decimal

Valor: 48000

regedit

  1. Feche o Editor de Registro

Se executarmos novamente a ferramenta Tokensz teremos o seguinte resultado:

tokensz-2

 

Leituras adicionais

http://support.microsoft.com/kb/938118

http://blogs.technet.com/b/shanecothran/archive/2010/07/16/maxtokensize-and-kerberos-token-bloat.aspx

http://support.microsoft.com/kb/327825/en-us

http://blogs.technet.com/b/askds/archive/2012/07/27/kerberos-errors-in-network-captures.aspx

http://blogs.technet.com/b/askds/archive/2007/11/02/what-s-in-a-token.aspx

http://support.microsoft.com/kb/327825/en-us

http://blogs.technet.com/b/askds/archive/2012/09/12/maxtokensize-and-windows-8-and-windows-server-2012.aspx

 

 

 

Escalando a montanha – 3º ano como MVP

O frio na barriga ainda continua o mesmo :) de quando recebi o título pela primeira vez em 2014, o dia todo esperando por este e-mail.

mvp16

Gostaria de agradecer a todos da comunidade tecnica a quem dedico esta renovação do título…

Conforme disse no ano passo, estou escalando a minha montanha:), tenho grandes amigos que são MVPs há 10 anos e espero chegar la um dia:), estou no meu 3º ano e espero atingir tantos quantos os meus amigos.

E aqui a foto atualizada de minha montanha que estou escalando…. já se foram 3 anos  … rumo aos 10 anos:)

OBS: Na barrinha de progressão do Windows já foi 30%:)

image1

Obrigado a todos e especial ao programa  MVP e a comunidade tecnica no geral.